Expression vector production and high-throughput cell screening

ABSTRACT

The present invention relates inter alia to expression vector production as well as application to the production of host cells for protein repertoire expression and high-throughput screening. The invention also relates to primers useful for PCR amplification of nucleotide sequences encoding human antibody variable domains.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 16/407,560, filed May 9, 2019, which is a division of U.S. patent application Ser. No. 14/917,236, filed Mar. 7, 2016, now U.S. Pat. No. 10,337,000, which is a 35 U.S.C. § 371 National Phase Entry Application of International Application No. PCT/GB2014/052836, filed Sep. 18, 2014, which claims priority under 35 U.S.C. § 119(e) to GB Application No. 1316644.2, filed Sep. 19, 2013, the contents of each of which are incorporated herein by reference in their entireties.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted in ASCII format via EFS-Web and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Jun. 1, 2022, is named 730297_SA9-624USDIVCON_ST25.txt and is 22.2 kilobytes in size.

The present invention relates inter alia to expression vector production as well as application to the production of host cells for protein repertoire expression and high-throughput screening. The invention also relates to nucleic acids, PCR primers and mixtures useful for PCR amplification of nucleotide sequences encoding human antibody variable domains or homologous recombination with Ig loci in cells.

BACKGROUND

The art recognises the desirability of screening repertoires of cells or proteins to identify proteins of interest (POI) or cells expressing these. For example, the art comprises techniques to screen B-cells and repertoires of antibodies, in order to identify one or more cells that express antibodies displaying a desired characteristic (typically specific antigen binding). Screening is not confined to antibody screening, but also may be applicable to screening collections of other types of proteins for one or more desired characteristics.

In order to express protein repertoires, corresponding nucleotide sequences can be cloned into respective expression vectors and introduced (e.g., transfected) into host cells that can express the proteins. Commonly, molecular cloning is used to clone protein-encoding sequences from a repertoire into host cells. Molecular cloning has been applied to techniques of B-cell (lymphocyte) screening wherein B-cells are expanded by culturing, sorted into single cells (e.g., using a haemolytic plaque assay or fluorescent foci assay), antibody chain (or variable region) mRNA from the sorted cells is reverse transcribed and amplified using RT-PCR, amplified DNA undergoes molecular cloning to introduce antibody chain-encoding sequences into nucleic acid vectors and the vectors are then introduced into host cells for transient expression (where the vector is episomal) or for stable expression (by random integration into the host cell genome). See, for example, Proc Natl Acad Sci USA. 1996 Jul. 23; 93(15):7843-8, “A novel strategy for generating monoclonal antibodies from single, isolated lymphocytes producing antibodies of defined specificities”, J S Babcook et al (known as the SLAM technique) and doi: 10.1016/j.jala.2009.05.004 Journal of Laboratory Automation October 2009 vol. 14 no. 5 303-307, “High-Throughput Screening for High Affinity Antibodies”, S Tickle et al (known as the UCB SLAM technique).

Molecular cloning involves PCR amplifying cDNA (produced by RT-PCR of mRNA of POIs) with primers carrying restriction enzyme cloning sites. This produces PCR products in which each POI nucleotide sequence is flanked by restriction sites. The PCR products are then digested with the appropriate restriction enzymes and subcloned by ligation into empty expression vectors, which provide a promoter and polyA signal and carry a selection marker. The ligated products are then transformed into E. coli and clones are selected by the presence of the marker. It is necessary to confirm and select clones with correct sequence insertions by laborious restriction mapping and sequencing of individual clones. Each single correct clone expression vector is then individually purified from E. coli and transfected into a host cell (e.g., a CHO or HEK293 mammalian cell) for transient or stable expression.

Molecular cloning is, therefore, a laborious technique involving multiple steps and is not well suited to high-throughput.

Transient expression, such as from a linear expression cassette, is usually limited and stable expression is preferred for longer-term expression. For stable expression, typically the POI-encoding sequence of interest is inserted into the mammalian genome by the classical integration method of spontaneous integration of foreign DNA (i.e., random integration in the genome). This approach often leads to significant transcriptional variation (and thus unpredictability and inconsistent POI expression) as a result of differences in the transgene copy number and the site of integration (see Henikoff S (1992), “Position effect and related phenomena”, Curr Opin Genet Dev 2(6): 907-912; Martin D I, Whitelaw E (1996), “The variegating transgenes”, Bioessays 18(11): 919-923; and Whitelaw E et al. (2001), “Epigenetic effects on transgene expression”, Methods Mol Biol. 158: 351-368). In addition, transgene fragments integrated in this way are often found to be inserted as concatemers which can result in gene inactivation by repeat-induced gene silencing (see Garrick D et al (1998), “Repeat-induced gene silencing in mammals”, Nat Genet 18(1): 56-59; and McBurney M W et al (2002), “Evidence of repeat-induced gene silencing in cultured Mammalian cells: inactivation of tandem repeats of transfected genes”, Exp Cell Res 274(1): 1-8). These problems hamper the production of repertoires of POIs and cell populations for expressing such POI repertoires.

Many technologies exist for the generation of monoclonal antibodies (mAbs) from humans or transgenic animals carrying an antibody repertoire. Generally, mAbs are obtained from immortalization of B cells either by fusion (hybridoma technology) or transformation (virus transfection or oncogene transformation). These cell immortalization methods, however, are unsuitable for a comprehensive screening of large antibody repertoires, because they are highly biased, inefficient and typically only sample a minute proportion of the available repertoire (typically less than 0.1% (immortalised cells/input cells) of a B-cell repertoire obtained from an immunised mouse, for example). The use of alternative B-cell screening methods that do not require immortalisation, therefore, is attractive, but such techniques currently face the problems discussed above.

SUMMARY OF THE INVENTION

The invention addresses the need for techniques of vector production that are amenable to high-throughput, reliable production suitable for repertoires and screening, particularly with the potential for automation.

Thus, a first configuration of the present invention provides:—

In a first aspect,

A method of producing cells encoding a repertoire of proteins of interest (POI), the method comprising:

a) Providing a population of cells expressing a repertoire of POIs;
b) Sorting the population of cells to produce a sorted population of single cells, each cell comprising nucleic acid encoding a respective POI;
c) Amplifying the nucleic acid comprised by the sorted single cell population to produce a sorted repertoire of amplified nucleic acids encoding POIs;
d) Modifying sorted amplified POI-encoding nucleic acids from step (c) to produce a sorted repertoire of expression cassettes, each cassette comprising a nucleotide sequence encoding a POI and one or more regulatory elements for expressing the POI; and
e) Transferring POI expression cassettes from said cassette repertoire to a sorted population of host cells while maintaining the POI expression cassette sorting and producing a sorted repertoire of host cells that expresses a sorted repertoire of POIs.

In an embodiment, sorting is maintained in step (e) by provision of the repertoires in a plurality of containers whose relative locations and overall arrangements are fixed. This enables high-throughput processing and automation of the method of the invention, e.g., for efficient and rapid cell screening to select one or more POI sequences of interest. Additionally or alternatively, transfer of the expression cassettes in step (e) is carried out by batch transfer, and this too enables high-throughput processing and automation of the method of the invention.

In a second aspect,

An automated apparatus for performing the method of any preceding aspect or concept, the apparatus comprising

a. Means for holding a sorted single cell population in a plurality of containers wherein each single cell is in a respective container, each cell comprising nucleic acid encoding a respective POI;
b. Means for delivering PCR reagents to the containers for amplifying nucleic acid comprised by the sorted single cell population to produce a sorted repertoire of amplified nucleic acids encoding POIs;
c. Means for delivering to the containers reagents for modifying sorted amplified POI-encoding nucleic acids to produce a sorted repertoire of expression cassettes, each cassette comprising a nucleotide sequence encoding a POI and one or more regulatory elements for expressing the POI;
d. Means for holding a sorted population of host cells in a plurality of containers;
e. Means for transferring POI expression cassettes from said cassette repertoire to the sorted population of host cells in the containers while maintaining the POI expression cassette sorting; and
f. Means for carrying out transfection of expression cassettes into host cells in the containers to produce a sorted repertoire of host cells that expresses a sorted repertoire of POIs.

A second configuration of the present invention provides:—

In a first aspect,

An expression cassette for expression of a POI in a host cell, the cassette being provided by linear nucleic acid comprising a transposon, the transposon comprising 5′- and 3′-terminal transposon elements with a POI-encoding nucleotide sequence and regulatory element(s) for POI expression between the transposon elements.

In a second aspect,

A sorted population of expression cassettes encoding a repertoire of POIs corresponding to POIs expressed by a population of cells, each cassette comprising a nucleotide sequence encoding a member of the repertoire of POIs and one or more regulatory elements for POI expression, wherein each said cassette comprises the arrangement (in 5′ to 3′ direction): transposon element-[POI nucleotide sequence & regulatory element(s)]-transposon element, and expression cassettes for expression of POIs from different cells are isolated from each other in the sorted population (e.g., in different wells of a plate).

In a third aspect,

A method of making a transposon comprising a nucleotide sequence of interest (NOI), the method comprising

a. Providing a first nucleotide sequence comprising (in 5′ to 3′ direction) A, B and C, wherein A is a first homology sequence, B is a nucleotide sequence comprising the NOI and C is a second homology sequence;
b. Providing a first template nucleotide sequence comprising (in 5′ to 3′ direction) W and X, wherein W is a nucleotide sequence comprising a first transposon element and X is a third homology sequence; and
c. Providing a second template nucleotide sequence comprising (in 5′ to 3′ direction) Y and Z, wherein Y is a fourth homology sequence and Z is a nucleotide sequence comprising a second transposon element; and either
d. (i) Mixing the first nucleotide sequence with the first template to hybridise the first and third homology arms together and carrying out nucleic acid amplification and extension to extend the first nucleotide sequence using the first template to produce a first extended nucleotide sequence (first ENS) comprising (in 5′ to 3′ direction) W, B and C; and (ii) mixing the first ENS with the second template to hybridise the second and fourth homology arms together and carrying out nucleic acid amplification and extension to extend the first ENS to produce a second ENS comprising (in 5′ to 3′ direction) W, B and Z; or
(ii) Mixing the first nucleotide sequence with the second template to hybridise the second and fourth homology arms together and carrying out nucleic acid amplification and extension to extend the first nucleotide sequence using the second template to produce a third extended nucleotide sequence (third ENS) comprising (in 5′ to 3′ direction) A, B and Z; and (ii) mixing the third ENS with the first template to hybridise the first and third homology arms together and carrying out nucleic acid amplification and extension to extend the third ENS to produce a fourth ENS comprising (in 5′ to 3′ direction) W, B and Z; or
(iii) Mixing the first nucleotide sequence with the first and second templates to hybridise the first and third homology arms together and to hybridise the second and fourth homology arms together and carrying out nucleic acid amplification and extension to extend the first nucleotide sequence using the second template to produce a fifth ENS comprising (in 5′ to 3′ direction) W, B and Z;
and
e. Isolating an ENS comprising (in 5′ to 3′ direction) W, B and Z, thereby producing an isolated transposon comprising a NOI flanked by transposon elements; and
f. Optionally introducing the isolated transposon into a recipient cell so that the transposon integrates into the genome of the cell.
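By way of non-limiting illustration only, the two-step extension of option (d)(i) can be sketched in Python by treating each nucleic acid as a text string. The sketch assumes, purely for simplicity, that the third homology sequence X is identical to A and that the fourth homology sequence Y is identical to C, and it ignores strand complementarity and polymerase chemistry. The function name `join_by_overlap` and all sequences shown are hypothetical placeholders and are not taken from the Sequence Listing.

```python
def join_by_overlap(left: str, right: str, min_overlap: int = 15) -> str:
    """Merge two sequences that share a homology sequence at their junction.

    Finds the longest suffix of `left` that is also a prefix of `right`
    (at least `min_overlap` bases) and keeps a single copy of it.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError("no shared homology sequence of sufficient length")


# Hypothetical placeholder sequences (illustrative only).
W = "TTAACCCTAGAAAGATA"    # stands in for the first transposon element
Z = "TATCTTTCTAGGGTTAA"    # stands in for the second transposon element
A = "GGATCCGAATTCGTCGAC"   # first homology sequence
B = "ATGGCCCAGGTGCAGCTG"   # sequence comprising the NOI
C = "CTCGAGAAGCTTGCGGCC"   # second homology sequence

first_sequence = A + B + C
template_1 = W + A          # W then X, with X assumed identical to A
template_2 = C + Z          # Y then Z, with Y assumed identical to C

# Option (d)(i): extend with the first template, then with the second.
first_ens = join_by_overlap(template_1, first_sequence)
second_ens = join_by_overlap(first_ens, template_2)

# Option (d)(ii): the same product is reached in the opposite order.
third_ens = join_by_overlap(first_sequence, template_2)
fourth_ens = join_by_overlap(template_1, third_ens)
```

In this simplified model both routes give the same product, in which the NOI-containing sequence B lies between W and Z with the homology sequences retained at the junctions.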

A third configuration of the present invention provides:—

A method of producing a host cell for expression of a POI, the method comprising

a. Providing at least first and second expression cassettes, wherein each expression cassette comprises
i. a first integration element and a second integration element 3′ of the first integration element nucleotide sequence; and
ii. between the integration elements a nucleotide sequence encoding a POI and one or more regulatory elements for expressing the POI;
iii. wherein the integration elements are capable of insertion into a nucleic acid by recognition of a predetermined nucleotide sequence motif of the nucleic acid using an integration enzyme;
b. Providing a host cell whose genome comprises a plurality of said motifs; and
c. Simultaneously or sequentially introducing the first and second expression cassettes into the host cell, wherein each cassette is genomically-integrated in the host cell genome at a said motif for expression of POIs by the host cell; and
d. Optionally producing a cell line expressing POIs in a step comprising culturing the host cell.

A fourth configuration of the present invention provides:—

A nucleic acid mixture comprising a first isolated nucleic acid and a second isolated nucleic acid, wherein the first nucleic acid is capable of hybridising to a human antibody V region 5′UTR sequence of a gene comprised by a target nucleic acid, wherein the gene encodes a human V region; and the second nucleic acid is capable of hybridising to a second sequence, wherein the second sequence is comprised by the target nucleic acid and is 3′ to the UTR sequence, wherein the first isolated nucleic acid comprises a sequence that is at least 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 1-47.

A nucleic acid mixture comprising a first isolated nucleic acid and a second isolated nucleic acid, wherein the nucleic acids are different and selected from nucleic acids comprising a sequence that is at least 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 1-47.
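As a non-limiting sketch, the "at least 90% identical" criterion can be illustrated in Python. The sketch assumes an ungapped, position-by-position comparison of two sequences of equal length, which is a simplification; in practice identity would typically be assessed with a sequence alignment tool. The function names and the example sequences are hypothetical and are not taken from SEQ ID NOs: 1-47.

```python
def percent_identity(candidate: str, reference: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if not reference or len(candidate) != len(reference):
        raise ValueError("this sketch only compares non-empty, equal-length sequences")
    matches = sum(x == y for x, y in zip(candidate.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def meets_identity_threshold(candidate: str, reference: str,
                             threshold: float = 90.0) -> bool:
    """True if the candidate is at least `threshold` percent identical."""
    return percent_identity(candidate, reference) >= threshold


# Hypothetical 20-base reference and two candidate sequences.
reference = "ACGTACGTACGTACGTACGT"
two_mismatches = "TCGTACGTACGTACGTACGA"    # differs at the first and last base
three_mismatches = "TCGTACGTAGGTACGTACGA"  # one further internal difference
```

Under these assumptions a 20-base candidate with two mismatches is 90% identical and meets the threshold, whereas one with three mismatches (85%) does not.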

A method of amplifying a repertoire of human variable region sequences using one or more of the sequences.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1A is a schematic illustrating a non-limiting example of the method of the invention for producing a repertoire of antibody binding sites (POIs) in host cells (e.g., CHO or HEK293 cells).

FIG. 1B is an alternative schematic illustrating a non-limiting example of the method of the invention for producing a repertoire of antibody binding sites (POIs) in host cells (e.g., CHO or HEK293 cells).

FIGS. 2A-2B depict examples of gating strategy for different populations of B cells. (FIG. 2A) An example of a flow cytometric contour plot displaying the gating of CD38+ CD95+ memory cell population after the exclusion of IgM and IgD B cells. (FIG. 2B) Each individual memory cell positive for CD19 (Pacific Blue) and antigen (Ovalbumin-AlexaFluor-488) is then sorted into separate wells in a 96-well plate.

FIG. 3 : A flowchart summarizing the high-throughput production of antibodies from single B cells. Single cell-based cDNA synthesis was performed using constant region-specific primers. A mixture of V gene-specific primers for heavy chain and light chain with a human cytomegalovirus (hCMV) promoter fragment at the 5′ end was used for the first round PCR to amplify the V gene fragment. A generic forward primer that annealed to the hCMV tag was used with a reverse nested primer for the constant region for the second round PCR. The amplified products were then bridged with linear Ig-cassette with 5′ PB LTR-CMV promoter and constant region-polyA signal-3′ PB LTR.

FIG. 4 : Production of heavy and light chain expression constructs by bridge PCR. (a) Ethidium bromide-stained agarose gels of the pairs of V_(H) and V_(L) gene fragments amplified from single B cells. Each lane contains 10 μL of a 30 μL V_(H)+V_(L) second round PCR product. (b) Ethidium bromide-stained agarose gels showing the bridge PCR results. Each lane contains 10 μL of a 30 μL bridge product for Ig-expression. The negative controls showed the Ig-cassette of heavy and light chains. Representative PCR products from the high-throughput 96-well PCR platform are shown here.

FIG. 5 : Analysis of the antibody sequences of the sorted Ag-specific single B-cells from project 1. The antibody sequences expressed by individual B cells were arranged by heavy-chain V-gene family usage and clustered to generate the displayed phylogenetic trees.

FIG. 6 : An example of a clustered family showing affinity maturation via hypermutation for both apparent affinity and neutralization potency. CNROR: unable to resolve off rate.

FIG. 7 : Expression level boosted by PiggyBac transposon system. Antibody expression of the transfected bridge PCR products in HEK293 cells was tested using co-transfection with different amounts of PBase: 0; 20; 100; 500 ng/well. Supernatants in each well were collected on day 0; day 2; and day 5. Data showed that the transposon system increased the expression level by 2-4 times on day 5.

FIG. 8 : Example of quantification of the IgG concentration in supernatant of HEK293 cells transfected with bridge PCR products. The concentration was determined by a sandwich ELISA eight days after the transfection. The concentration is from about 200 ng/mL to 3000 ng/mL which is comparable to the concentration in the supernatant of prior art hybridoma technologies.

FIGS. 9A and 9B: Representative SPR sensorgrams of antibodies binding to ovalbumin. Around two thirds of antibodies tested showed evidence of binding to the antigen, with a diverse range of distinct binding affinities and kinetics.

FIG. 10 : The apparent affinity of antibodies against two different target antigens. A range of binders (open) as well as functional neutralisers (filled) were detected, with the highest affinity detected in the picomolar range. This experiment validates our single B cell cloning technology as a powerful tool in the identification and retrieval of high affinity and functionally competent antibodies.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 is a schematic illustrating an example of the method of the invention for producing a repertoire of antibody binding sites (POIs) in host cells (e.g., CHO or HEK293 cells), which is useful in a process of B-cell screening. The host cell repertoire produced at the end of the schematic in FIG. 1 is useful since the cells express antibody binding sites (e.g., V_(H)/V_(L) pairs) that can be screened against a predetermined antigen to identify which cells express binding sites of interest. This thus enables genotype to phenotype linkage, allowing identification of nucleotide sequence that encodes a binding site of interest. This nucleotide sequence can then be used to construct cell lines for the stable production of desired antibody, e.g., to produce antibody pharmaceuticals or diagnostic or research reagents targeting the predetermined antigen. In an example, the host cell produced at the end of the schematic of FIG. 1 is itself a cell that stably expresses the binding site and this offers the possibility of convenient and efficient stable cell line generation for longer term manufacture of the binding site. As discussed further below, in an example site-mediated insertion of POI-encoding expression cassettes can be performed in host cells, which provides the added advantage of stable POI expression that avoids random integration and concatemers.

While examples are provided in terms of antibody binding sites, antibody variable domains and chains, the invention is not limited to such proteins. In this sense, the invention is applicable to any proteins of interest (POI). The term “protein” in this context also includes polypeptides and peptides which are of interest, as well as protein or polypeptide fragments or domains.

Turning to the non-limiting example of FIG. 1 , in a first step, a desired B-cell population is provided, which may be germinal cells, memory cells, plasmablasts, plasma cells or generally may be antibody-secreting cells (ASCs), or mixtures of two or more of these B-cell types. The skilled person is familiar with selecting such populations, for example using cell surface markers optionally with FACS. Such markers may be selected from CD19, IgM, IgD, CD30 and CD95 for example, and may include a panel of one or more of these (e.g., 1, 2, 3, 4), or may include a panel of all of these markers. Cells bearing these markers may be stained, for example, with any small molecule fluorophore which can be detected by the cell sorting system, such as Alexa-488, Alexa-647, Pacific Blue, R-phycoerythrin, fluorescein isothiocyanate, or allophycocyanin optionally conjugated to a cyanine dye, e.g. Cy7. The B-cell population may be generated in a preliminary step by isolation from one or more animals (e.g., mice, rats or humans or non-human animals), for example an animal that has been immunised with a target antigen. Thus, the method of the invention is useful for screening B-cell populations to identify one or more POI sequences derived from a B-cell wherein the POI binds the target antigen with a desired characteristic. Such a characteristic is, e.g., binding to the target antigen and/or a structurally-related, homologous or orthologous antigen (e.g., one from a different species); and/or binding an antigen with a desired affinity; and/or competition with a known antibody for binding a predetermined antigen (e.g., the target antigen); and/or binding a predetermined epitope.

Optionally, the desired input B-cell population (e.g., plasmablasts or plasma cells) can be selected by performing antigen-specific cell sorting using techniques known in the art (e.g., see Jin, A et al., Nature Medicine, 2009, 15(9): 1088-1093). The output is a population of B-cells that express antibody binding sites that specifically bind a predetermined target antigen. The inventors have found it advantageous to perform this step to streamline the overall B-cell screening process, thereby allowing for efficient and higher throughput screening.

Generally, antigen-specific GC (germinal centre) or memory B cells may be captured by labelled antigen because they dominantly express transmembrane antibodies on the cell surface. The antigen may be fluorescently labelled (for example, any small molecule fluorophore which can be detected by the cell sorting system, such as Alexa-488, Alexa-647, Pacific Blue, R-phycoerythrin, fluorescein isothiocyanate, or allophycocyanin optionally conjugated to a cyanine dye, e.g. Cy7). If the B cell population has been previously stained for initial selection, it is advantageous to use a fluorophore having a different emission wavelength to the fluorophore used in that initial selection, in order to facilitate the sorting process, e.g. using FACS. In an alternative, the cell may be simultaneously stained for presence of the cell selection marker and screened for binding to fluorescent antigen/antigen-bearing VLPs. Without being limited to theory, it is thought that, on the other hand, plasmablasts or plasma cells would not be as easily captured by labelled antigen, because of their dominant expression of secreted antibody.

In an alternative embodiment, the antigen may be captured with virus-like particles (VLPs) with recombinant antigen on the surface. VLPs may be generated from CHO cells, HEK cells, mouse embryonic fibroblasts (MEFs) or other mammalian cell lines with co-expression of the recombinant antigen, the retrovirus gag protein, and MA-GFP (gag matrix fragment p15-GFP fusion protein). The gag protein expression enables VLPs to bud from cells, and the MA-GFP labels the VLPs for fluorescence detection. Both gag and MA-GFP proteins are associated with the inner surface of the plasma membrane, and recombinant antigen is on the VLP surface. The antigens on the VLPs are presented in native form, directly expressed from recombinant cells without any step of purification or modification. The native form of an antigen should provide all the natural epitopes, which greatly helps selection of neutralizing antibodies. The high density of the antigen on the VLPs increases the signal/noise ratio for detection of cells expressing antigen-specific antibodies on the cell surface and greatly facilitates the sorting step. The recombinant VLPs can be generated with expression of different fluorescent proteins, such as MA-CFP or MA-YFP. Using multiplexing of VLPs with different antigens and different fluorescent proteins, cells expressing high affinity binders, cross-reactive binders or homolog-specific binders can be selected. The cells expressing high affinity binders may be selected as cells with a relatively high affinity matrix (affinity matrix = the ratio of binding activity to low density antigen VLPs over binding activity to high density antigen VLPs). The cells expressing cross-reactive binders to orthologs or different antigens (for 2-in-1 bi-specific antibody isolation) can be selected as cells binding to different types of VLPs at the same time. The cells expressing homolog-specific binders can also be selected as cells binding only to the specific antigen but not to its homolog.
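Purely as a non-limiting sketch, the ratio described above (binding activity to low density antigen VLPs over binding activity to high density antigen VLPs) can be computed per cell and used to rank cells. The function names, the fluorescence values and the cut-off of 0.5 below are hypothetical and are chosen only to illustrate the calculation.

```python
from typing import Dict, List, Tuple


def affinity_ratio(low_density_signal: float, high_density_signal: float) -> float:
    """Binding signal on low density antigen VLPs divided by that on high density VLPs."""
    if high_density_signal <= 0:
        raise ValueError("high density signal must be positive")
    return low_density_signal / high_density_signal


def select_high_affinity(cells: Dict[str, Tuple[float, float]],
                         min_ratio: float) -> List[str]:
    """Return cell identifiers with ratio >= min_ratio, highest ratio first."""
    selected = [cell for cell, (low, high) in cells.items()
                if affinity_ratio(low, high) >= min_ratio]
    return sorted(selected, key=lambda cell: affinity_ratio(*cells[cell]), reverse=True)


# Hypothetical (low density, high density) fluorescence signals per cell.
signals = {
    "cell_1": (800.0, 1000.0),
    "cell_2": (100.0, 1000.0),
    "cell_3": (500.0, 1000.0),
}
ranked = select_high_affinity(signals, min_ratio=0.5)
```

A cell that retains most of its signal when antigen density is low scores close to 1 and is ranked first; any cut-off used in practice would need to be set empirically for the assay concerned.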

B-cells are thus sorted (e.g., using FACS) to provide a population of sorted, single B-cells. Typically, the B-cells are sorted into wells of a standard plate (e.g., a standard 96-well or 384-well plate) so that each single cell is in a respective well and not mixed with another cell. It is possible for there to be a minimal number (e.g., less than 5%, less than 3%, less than 2% or less than 1%, or non-detectable levels) of wells having no or more than one (e.g., two) cells, and this does not hamper the overall utility of the screening method (and is not considered part of the desired repertoire). Preferably, each well on the plate contains a single B-cell. Optionally, it is possible to culture cells directly before and/or after the cell sorting step, but the inventors have found this not to be necessary, unlike techniques in the art. Thus, by avoiding this step the method of the invention lends itself further to streamlining and high throughput.

Next, in the example shown in FIG. 1 , POI-encoding nucleic acid is amplified. In the example, this is performed by reverse transcribing mRNA in cells of the sorted cell population (i.e., POI-encoding mRNA is converted to corresponding POI-encoding cDNA) and this is amplified using PCR. Standard RT-PCR can be used as will be familiar to the skilled person (e.g., see Dixon A K et al, Trends Pharmacol. Sci., 2000, 21(2): 65-70). This yields a repertoire of DNA encoding a repertoire of antibody variable regions. In this example, both V_(H) and V_(L) sequences are copied and amplified for each sorted cell. The skilled person will know that each cell expresses a single type of antibody binding site, thus only a single type of amplified V_(H) sequence and a single type of V_(L) sequence will be amplified per single sorted cell (thus, where wells are used, only a single V_(H) and V_(L) type per well will be obtained, which thereby retains the grouping of V_(H) and V_(L) sequences forming binding sites).

POI-encoding sequences are also modified to produce a repertoire of expression cassettes for expressing POIs from host cells in a later stage of the method. The expression cassettes contain a POI-encoding nucleotide sequence and one or more regulatory elements for expression (e.g., a promoter and/or an enhancer and/or a polyA). In an embodiment, the amplification and modification steps can be performed simultaneously using PCR and appropriate templates and primers as will be apparent to the skilled person. In another embodiment, amplification and modification are carried out in separate steps, e.g., amplification and then modification. In the example of FIG. 1 , RT-PCR is first carried out and then the amplified POI-encoding sequences are modified to produce expression cassettes each comprising (in 5′ to 3′ direction): promoter-POI nucleotide sequence-polyA. In this example, flanking transposon elements are also added. The transposon elements can be piggyBac (PB) transposon inverted terminal repeat elements to form a transposon nucleic acid comprising the structure: [5′ PB element]-[promoter]-[POI nucleotide sequence]-[polyA]-[3′ PB element]. In an example, each expression cassette is provided by linear DNA comprising or consisting of the transposon.

PB 5′ element (SEQ ID NO: 55):
GATATCTATAACAAGAAAATATATATATAATAAGTTATCACGTAAGTAGAACATGAAATAACAATATAATTATCGTATGAGTTAAATCTTAAAAGTCACGTAAAAGATAATCATGCGTCATTTTGACTCACGCGGTCGTTATAGTTCAAAATCAGTGACACTTACCGCATTGACAAGCACGCCTCACGGGAGCTCCAAGCGGCGACTGAGATGTCCTAAATGCACAGCGACGGATTCGCGCTATTTAGAAAGAGAGAGCAATATTTCAAGAATGCATGCGTCAATTTTACGCAGAGTATCTTTCTAGGGTTAA

PB 3′ element (SEQ ID NO: 56):
TTTGTACTTATAGAAGAAATTTTGAGTTTTTGTTTTTTTTTAATAAATAAATAAACATAAATAAATTGTTTGTTGAATTTATTATTAGTATGTAAGTGTAAATATAATAAAACTTAATATCTATTCAAATTAATAAATAAACCTCGATATACAGACCGATAAAACACATGCGTCAATTTTACGCATGATTATCTTTAACGTACGTCACAATATGATTATCTTTCTAGGGTTAA
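As a non-limiting illustration, the arrangement [5′ PB element]-[promoter]-[POI nucleotide sequence]-[polyA]-[3′ PB element] can be modelled in Python as an ordered concatenation, together with a simple check that the elements occur in that order in a linear sequence. The short strings used here are hypothetical placeholders rather than the sequences of SEQ ID NOs: 55 and 56 or any real promoter or polyA signal, and the function names are assumptions made for this sketch.

```python
from typing import Sequence


def assemble_cassette(pb_5prime: str, promoter: str, poi: str,
                      poly_a: str, pb_3prime: str) -> str:
    """Concatenate the elements in 5' to 3' order to give a linear cassette."""
    return pb_5prime + promoter + poi + poly_a + pb_3prime


def elements_in_order(cassette: str, elements: Sequence[str]) -> bool:
    """True if every element occurs in the cassette, one after another in the given order."""
    position = 0
    for element in elements:
        found = cassette.find(element, position)
        if found == -1:
            return False
        position = found + len(element)
    return True


# Hypothetical placeholder elements (illustrative only).
pb5, promoter, poi, poly_a, pb3 = "GATATC", "TATAAA", "ATGGCC", "AATAAA", "GGGTTAA"
cassette = assemble_cassette(pb5, promoter, poi, poly_a, pb3)

correct_order = elements_in_order(cassette, [pb5, promoter, poi, poly_a, pb3])
wrong_order = elements_in_order(cassette, [promoter, pb5, poi, poly_a, pb3])
```

A check of this kind could, for example, be applied to sequencing reads of bridge PCR products, although the exact-match search used here is a simplification.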

As shown in the example of FIG. 1 , bridge PCR (see, e.g., Mehta R. K., Singh J., Biotechniques, 1999, 26(6):1082-1086) can be used to construct the transposon by adding the flanking transposon elements.

At this stage of the method, one has obtained a repertoire of expression cassettes encoding a repertoire of POI-encoding nucleotide sequences derived from the input population of cells. In this example, the amplification and modification procedures are carried out on the single cells that are sorted in wells on one or more plates. Thus, the result is a sorted repertoire of expression cassettes (in this case per single well, the well contains a plurality of one type of V_(H) expression cassette and one type of V_(L) expression cassette derived from one type of antibody binding site of a single input cell). Next, the sorted expression cassettes (in the present example, contained in respective transposons and optionally as linear DNAs) are transferred to host cells to produce a sorted repertoire of host cells that expresses a sorted repertoire of POIs. This can be performed, in the present example, by transferring expression cassettes from wells on one plate (or one set of plates) to one or more other plates having a plurality of wells containing host cells (e.g., identical types of cells of the same cell line, e.g., CHO or HEK293 or yeast cells). In this embodiment, the sorted cassette repertoire can be transferred manually or with a robot using a multi-channel pipette (e.g., a 4, 8, 12, 16 (2×8), 64 (8×8), 96 (12×8), 384 or 1536 channel pipette as is known in the art) to simultaneously retrieve cassettes (in this case sorted VH/VL pairs of cassettes) from individual wells on a first plate and transfer to corresponding wells on a second plate, wherein those wells contain host cells. By doing this, the relative location and sorted nature of the expression cassettes is maintained when transferring to host cells. This has the advantage of maintaining linkage of V_(H)/V_(L) cassette sequence pairs without mixing (i.e., per well on the second plate, the well contains host cells and V_(H) and V_(L) sequences derived from a single input B-cell only). Advantageously, also, if some cassette sample is left on the first plate, one can then easily identify one or more wells as a source of useful POI-encoding sequence (and expression cassette and transposon) by reference to desired positives found on the second plate following screening of the second plate for a desired antibody binding site characteristic. Also, advantageously one can provide the host cells in a medium that desirably supports functioning and maintenance of the host cells (typically different from the environment used in the wells of the first plate for sequence amplification and modification). In an alternative embodiment, host cells are added to wells of the plate containing the expression cassette repertoire so that the sorted arrangement is retained (but this then does not provide the additional advantages of the other embodiment where a master plate of expression cassettes is retained without mixture with host cells, which is useful, for example, for carrying out a second transfer to another plate having a different type of host cell, such as when one wants to assess the performance of POI expression and display/secretion by a different host cell type).

An example of an automated multi-channel pipette is the Thermo Scientific Matrix Hydra II 96-Channel Automated Liquid Handling System. V&P Scientific VP 177AD-1 and VP 179BJD are dispensing manifolds designed for rapid filling of 96 and 384 well plates respectively, and either of these can be used in the method of the invention to transfer cassettes to host cells or for general sample handling.
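The position-preserving plate-to-plate transfer described above can be illustrated with a minimal sketch. It assumes conventional well naming (rows A to H, columns 1 to 12 for a 96-well plate) and, for the 96-to-384 case, an interleaved quadrant layout; the function names and the quadrant convention are illustrative assumptions and are not taken from the specification.

```python
import string


def well_names(rows: int = 8, cols: int = 12) -> list[str]:
    """Return well names in row-major order, e.g. A1..H12 for an 8x12 plate."""
    letters = string.ascii_uppercase
    return [f"{letters[r]}{c + 1}" for r in range(rows) for c in range(cols)]


def same_layout_transfer(rows: int = 8, cols: int = 12) -> dict[str, str]:
    """Map each source well to the identically positioned destination well."""
    return {well: well for well in well_names(rows, cols)}


def quadrant_transfer(quadrant: int, rows: int = 8, cols: int = 12) -> dict[str, str]:
    """Map a 96-well source plate into one interleaved quadrant (0-3) of a
    384-well plate (assumed convention: every second row and column)."""
    if quadrant not in (0, 1, 2, 3):
        raise ValueError("quadrant must be 0, 1, 2 or 3")
    row_offset, col_offset = divmod(quadrant, 2)
    letters = string.ascii_uppercase
    mapping = {}
    for r in range(rows):
        for c in range(cols):
            source = f"{letters[r]}{c + 1}"
            destination = f"{letters[2 * r + row_offset]}{2 * c + col_offset + 1}"
            mapping[source] = destination
    return mapping


if __name__ == "__main__":
    print(same_layout_transfer()["B7"])
    print(quadrant_transfer(3)["H12"])
```

Because every source well maps to exactly one destination well, a positive well found on the screened plate can be traced back to the well of the master plate that supplied its cassettes.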

The output of the method of the invention, therefore, in its broadest aspect is the production of a repertoire of host cells that expresses a repertoire of POIs. As discussed above, this can then be used in a subsequent step of screening (e.g., using a cell binding assay, ELISA, surface plasmon resonance or other assay appropriate for the particular nature of the POI) to identify one or more host cells that express a POI with a desired characteristic. One is then able to isolate POI from the cell or surrounding medium and/or isolate (and optionally replicate or amplify) a nucleotide sequence (e.g., DNA sequence) encoding the desired POI. The nucleotide sequence can be determined. Such isolated cells can be cultured to produce a POI-encoding cell line (which is especially useful when the cassette has been stably integrated into the cell genome, as is possible with site-directed genomic integration, e.g., performed using a transposon). The POI-encoding nucleotide sequence can be inserted into a different expression vector and/or mutated (e.g., affinity matured) or fused to another protein sequence.

Transposon integration is effected by providing a corresponding transposase enzyme in the host cells (e.g., by co-transfection of expression cassettes with a vector comprising an expressible transposase gene or by providing host cells that harbour such a transposase gene, e.g., inducibly). In an example, each cassette comprises piggyBac transposon elements and the method uses a piggyBac transposase (e.g., wild-type or hyperactive piggyBac transposase; see, e.g., Yusa, K et al., PNAS USA, 2011, 10 (4):1531-1536; and Yusa K et al., “A hyperactive piggyBac transposase for mammalian applications”, PNAS USA, 2010, 108(4):1531-1536).

WT PBase (SEQ ID NO: 57):
MGCSLDDEHILSALLQSDDELVGEDSDSEISDHVSEDDVQSDTEEAFIDEVHEVQPTSSGSEILDEQNVIEQPGSSLASNRILTLPQRTIRGKNKHCWSTSKSTRRSRVSALNIVRSQRGPTRMCRNIYDPLLCFKLFFTDEIISEIVKWTNAEISLKRRESMTGATFRDTNEDEIYAFFGILVMTAVRKDNHMSTDDLFDRSLSMVYVSVMSRDRFDFLIRCLRMDDKSIRPTLRENDVFTPVRKIWDLFIHQCIQNYTPGAHLTIDEQLLGFRGRCPFRMYIPNKPSKYGIKILMMCDSGTKYMINGMPYLGRGTQTNGVPLGEYYVKELSKPVHGSCRNITCDNWFTSIPLAKNLLQEPYKLTIVGTVRSNKREIPEVLKNSRSRPVGTSMFCFDGPLTLVSYKPKPAKMVYLLSSCDEDASINESTGKPQMVMYYNQTKGGVDTLDQMCSVMTCSRKTNRWPMALLYGMINIACINSFIIYSHNVSSKGEKVQSRKKFMRNLYMSLTSSFMRKRLEAPTLKRYLRDNISNILPNEVPGTSDDSTEEPVMKKRTYCTYCPSKIRRKANASCKKCKKVICREHNIDMCQSCF

Hyperactive PBase (SEQ ID NO: 58):
MGSSLDDEHILSALLQSDDELVGEDSDSEVSDHVSEDDVQSDTEEAFIDEVHEVQPTSSGSEILDEQNVIEQPGSSLASNRILTLPQRTIRGKNKHCWSTSKPTRRSRVSALNIVRSQRGPTRMCRNIYDPLLCFKLFFTDEIISEIVKWTNAEISLKRRESMTSATFRDTNEDEIYAFFGILVMTAVRKDNHMSTDDLFDRSLSMVYVSVMSRDRFDFLIRCLRMDDKSIRPTLRENDVFTPVRKIWDLFIHQCIQNYTPGAHLTIDEQLLGFRGRCPFRVYIPNKPSKYGIKILMMCDSGTKYMINGMPYLGRGTQTNGVPLGEYYVKELSKPVHGSCRNITCDNWFTSIPLAKNLLQEPYKLTIVGTVRSNKREIPEVLKNSRSRPVGTSMFCFDGPLTLVSYKPKPAKMVYLLSSCDEDASINESTGKPQMVMYYNQTKGGVDTLDQMCSVMTCSRKTNRWPMALLYGMINIACINSFIIYSHNVSSKGEKVQSRKKFMRNLYMGLTSSFMRKRLEAPTLKRYLRDNISNILPKEVPGTSDDSTEEPVMKKRTYCTYCPSKIRRKASASCKKCKKVICREHNIDMCQSCF
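Where it is useful to see how the two listed transposase sequences differ, a position-wise comparison can be sketched as follows. This is only an illustrative helper (the name `list_substitutions` is an assumption); it requires two sequences of equal length, and for brevity the usage shows only the first 34 residues of each sequence as listed above.

```python
def list_substitutions(reference: str, variant: str) -> list[str]:
    """List single-residue differences as <ref><1-based position><variant>."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length for position-wise comparison")
    return [
        f"{ref}{position}{var}"
        for position, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


# First 34 residues of SEQ ID NO: 57 and SEQ ID NO: 58 as listed above.
WT_PREFIX = "MGCSLDDEHILSALLQSDDELVGEDSDSEISDHV"
HYPERACTIVE_PREFIX = "MGSSLDDEHILSALLQSDDELVGEDSDSEVSDHV"

if __name__ == "__main__":
    print(list_substitutions(WT_PREFIX, HYPERACTIVE_PREFIX))  # ['C3S', 'I30V']
```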

When a transposon is used for cassette integration in the host cell there are additional benefits. Firstly, transposons can integrate in several copies in host cells, thereby providing for multi-copy expression cassettes to support high level POI expression. This can be further promoted using a hyperactive transposase enzyme. Additionally, transposons can integrate preferentially at sites that are active for transcription, thereby also favouring high level and efficient POI expression as demonstrated by mapping analysis of integration sites. This is seen, for example, with piggyBac (see Wang W., et al., “Chromosomal transposition of PiggyBac in mouse embryonic stem cells”, 2008, PNAS USA, 105(27):9290-9295; Galvan D. L., et al., “Genome-wide mapping of PiggyBac transposon integration in primary human T cells”, J. Immunother., 2009, 32(8): 837-844; and Yang W., et al., “Development of a database system for mapping insertional mutations onto the mouse genome with large-scale experimental data”, 2009, BMC genomics, 10 (Suppl 3):S7).

Thus, the present invention provides the following concepts:—

-   1. A method of producing cells encoding a repertoire of proteins of interest (POI), the method comprising:
    -   a) Providing a population of cells expressing a repertoire of POIs;
    -   b) Sorting the population of cells to produce a sorted population of single cells, each cell comprising nucleic acid encoding a respective POI;
    -   c) Amplifying the nucleic acid comprised by the sorted single cell population to produce a sorted repertoire of amplified nucleic acids encoding POIs;
    -   d) Modifying sorted amplified POI-encoding nucleic acids from step (c) to produce a sorted repertoire of expression cassettes, each cassette comprising a nucleotide sequence encoding a POI and one or more regulatory elements for expressing the POI; and
    -   e) Transferring POI expression cassettes from said cassette repertoire to a sorted population of host cells while maintaining the POI expression cassette sorting and producing a sorted repertoire of host cells that expresses a sorted repertoire of POIs.

Steps (c) and (d) can be carried out separately (in any order, e.g., (c) then (d)) or simultaneously.

Optionally, step (c) comprises PCR amplification of POI-encoding sequences, e.g., RT-PCR using POI-encoding mRNAs as template (e.g., using one, two, three or more primers comprising or consisting of a sequence selected from the group consisting of SEQ ID NOs: 1-53, as discussed further below). Optionally, the amplified nucleic acids of the repertoire in step (c) are DNAs. In an example, each POI is an antibody variable domain and step (c) comprises PCR amplification of POI-encoding sequences using one or more V region-specific 5′ primers and/or one or more C region 3′ primers (e.g., Cγ, e.g., mouse Cγ primer). Optionally, the PCR comprises 5′- and/or 3′-RACE of POI-encoding nucleotide sequences. In an example, the 5′-RACE is carried out using one or more 5′ primers each homologous to a 5′ UTR or promoter sequence of an antibody variable region. In an example, the 3′-RACE is carried out using one or more 5′ primers each homologous to an antibody constant region, e.g., a CH1 or Fc sequence. In this instance, each amplified POI-encoding sequence encodes an antibody chain comprising an antibody variable domain and constant region. In one example, the 3′-RACE uses one or more human constant region sequences as a primer; this then produces sequences encoding humanised variable regions in which each variable region is fused to a human constant region (e.g., a human gamma CH1 or Fc (e.g., gamma Fc)), thereby providing a human antibody chain (POI) upon subsequent expression. This humanisation during step (c) is useful since POIs identified from later screening represent human chains that can be used to produce antibody therapeutics for human use. In an example, the 5′-RACE uses a 5′ template comprising a variable region promoter for producing amplified nucleic acids comprising (in 5′ to 3′ direction): a promoter and a nucleotide sequence encoding a POI. Additionally or alternatively, 3′-RACE is used, wherein the RACE uses a 3′ template comprising a polyA sequence for producing amplified nucleic acids comprising (in 5′ to 3′ direction): a nucleotide sequence encoding a POI and a polyA. In this case, amplification and modification to produce expression cassettes can be carried out simultaneously (i.e., steps (c) and (d) are carried out simultaneously).

In an example, step (d) modification is carried out using PCR, e.g., bridge PCR. For example, step (d) is carried out after step (c), e.g., after RT-PCR or RACE amplification. In this case, bridge PCR is carried out in a step comprising hybridising a first primer to the 5′ end of the nucleic acid products of step (c); and hybridising a second primer to the 3′ end of the nucleic acid products of step (c) (or to the 3′ end of the nucleic acid product of the hybridisation step using the first primer). Alternatively, the second primer can be used initially (to hybridise to the product of step (c)) and the product of that can be hybridised with the first primer. Alternatively, the first and second primers and product of step (c) can be mixed together and PCR carried out. The result in any case is an extended product comprising (in 5′ to 3′ order):

[5′ sequence of the first primer]-[a promoter]-[a nucleotide sequence encoding a POI]-[a polyA]-[a 3′ sequence of the second primer]
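The 5′ to 3′ element order of the extended product can be sketched in code as a simple ordered concatenation. This is only an illustration of the layout: the element labels, the function name `assemble_cassette` and the short placeholder sequences are assumptions for the example and do not correspond to any sequence in the specification.

```python
# Element order of the extended product, 5' to 3' (labels are illustrative).
CASSETTE_ORDER = (
    "first_primer_5prime_sequence",
    "promoter",
    "poi_coding_sequence",
    "polyA",
    "second_primer_3prime_sequence",
)


def assemble_cassette(elements: dict[str, str]) -> str:
    """Concatenate the named elements in 5' to 3' order.

    Raises ValueError if any element of the layout is missing.
    """
    missing = [name for name in CASSETTE_ORDER if name not in elements]
    if missing:
        raise ValueError(f"missing cassette elements: {missing}")
    return "".join(elements[name] for name in CASSETTE_ORDER)


if __name__ == "__main__":
    # Placeholder sequences only; real elements would be far longer.
    toy_elements = {
        "first_primer_5prime_sequence": "AAAA",
        "promoter": "CCCC",
        "poi_coding_sequence": "ATGGGGTAA",
        "polyA": "TTTT",
        "second_primer_3prime_sequence": "GGGG",
    }
    print(assemble_cassette(toy_elements))
```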

In one embodiment, the promoter and polyA are combined with the POI-encoding nucleotide sequence by step (c), as described above (e.g., using 5′- and 3′-RACE with appropriate primers). In another embodiment, the first primer used in step (d) comprises a promoter sequence (e.g., such a sequence at the 3′ end of the primer). The result of step (d) then combines the promoter with the POI-encoding nucleotide sequence. Additionally or alternatively, the second primer used in step (d) comprises a polyA sequence (e.g., such a sequence at the 5′ end of the primer). The result of step (d) then combines the polyA with the POI-encoding nucleotide sequence. Other combinations are possible, e.g., the promoter is added in step (c) using the appropriate primer and the polyA is added in step (d) using the appropriate second primer. In an example, step (c) (e.g., RACE) adds 5′ and/or 3′ sequences in the product nucleic acids that can be used for hybridisation with primers in a step (d) wherein the latter uses bridge PCR.

The result of steps (c) and (d) is always a repertoire of expression cassettes for expression of a repertoire of POIs. In an example, one or more regulatory elements required or desired for expression (or optimal expression) is omitted from each cassette, but is instead provided by the host cell genomes once the cassettes have been introduced into the host cells.

In an embodiment, step (d) adds 5′ and 3′ integration sequences flanking the promoter and polyA respectively. For example, the 5′ sequence is a 5′ transposon element (e.g., a 5′ PB terminal element) and the 3′ sequence is a 3′ transposon element (e.g., a 3′ PB terminal element that is in inverted orientation with respect to the 5′ element). For example, the 5′ integration sequence is provided at the 5′ terminus of the first bridge PCR primer and/or the 3′ integration sequence is provided at the 3′ terminus of the second bridge PCR primer. The result is a repertoire of expression cassettes, each terminating (5′ and 3′) in transposon elements, i.e., each cassette is modified to form a transposon. Optionally, each expression cassette is produced as a linear DNA terminating at the 5′ and 3′ end by an integration sequence (e.g., a transposon element, e.g., terminating in inverted terminal PB transposon elements). Alternative integration sequences can be used instead of transposon elements (see further below); for example, the integration sequences can be homology arms (e.g., at least 15, 20, 50 or 100 contiguous nucleotides) for carrying out homologous integration into recipient host cell genomes at one or more specific target sites (that hybridise with the homology arms). Alternatively, the integration sequences can be site-specific recombination sequences (e.g., lox, rox or frt) for site-specific integration into host cell genomes carrying corresponding site-specific recombination sites at one or more desired integration sites in the genome (upon provision or expression of the respective integrase, i.e., cre, dre or flp respectively). In an embodiment, RMCE is used to insert using two incompatible recombination sites (e.g., wild type loxP and a mutant lox, e.g., lox2272 or lox511).

In one embodiment, one or more regulatory elements for POI expression (e.g., a promoter and/or an enhancer and/or a polyA and/or a signal sequence) is added by step (c), e.g., using RACE. Additionally or alternatively, one or more regulatory elements for POI expression (e.g., a promoter and/or an enhancer and/or a polyA and/or a signal sequence) is added by step (d), e.g., using bridge PCR.

Additionally or alternatively, in one embodiment, a 5′ and/or a 3′ integration sequence is added by step (c), e.g., using RACE. Additionally or alternatively, a 5′ and/or a 3′ integration sequence is added by step (d), e.g., using bridge PCR.

In an embodiment, one or more regulatory elements for POI expression (e.g., a promoter and/or an enhancer and/or a polyA and/or a signal sequence) and a 5′ and/or a 3′ integration sequence are added by step (c), e.g., using RACE.

In an embodiment, one or more regulatory elements for POI expression (e.g., a promoter and/or an enhancer and/or a polyA and/or a signal sequence) and a 5′ and/or a 3′ integration sequence are added by step (d), e.g., using bridge PCR.

-   2. The method of concept 1, wherein in step (e) the sorted expression cassettes are batch transferred to the sorted host cells.

Batch transferral according to this embodiment of the invention is superior to the prior art methods of using molecular cloning to transfer POI-encoding nucleotide sequences into host cells. As explained above, the latter requires laborious and time-consuming sequencing, analysis and subcloning of individual POI-encoding sequences that have been modified by the inclusion of terminal restriction sites (to enable introduction into host cells in subsequent steps). Typically, once a correct PCR-amplified POI-encoding sequence with restriction sites has been confirmed, this is then selected as an individual sequence to take forward for introduction into host cells, the latter then being grown up to produce a population of cells expressing the chosen POI sequence. This process of multi-step, laborious molecular cloning is performed for each POI variant in a repertoire that is to be included in subsequent screening. Consequently, prior art screening methods can take several weeks (typically on the order of 6-8 weeks) to perform for a useful repertoire of starting cells, such as antibody producing cells. The method of the invention, in contrast, uses sorted batch transferral of entire expression cassettes for POIs (i.e., including POI and regulatory elements for expression) and so provides a much faster technique for producing a sorted repertoire of POIs for screening. This makes the present method amenable to high-throughput automation of screening. For example, the present inventors, performing manual operation of the method, have been able to perform production of a sorted host cell repertoire for POI expression in only 2 days using 4×96-well plates (approximately 186 input B-cells). Screening of the expressed POI repertoire can be performed manually in around 2 days only. Clearly, automation speeds this up even more (and advantageously minimises cross-contamination between sorted aliquots).

For batch transferral of expression cassettes, a plurality of sorted expression cassettes are mixed in the same operation (e.g., a single cassette aspiration and delivery step, i.e., a single pipetting step) with the host cells for transferral into the cells (e.g., by subsequent or simultaneous transfection into the cells). That operation is, for example, a single transferral by pipette (e.g., using a multi-channel pipette, e.g., using a 4, 8, 12, 16, 64, 96, 384 or 1536 channel pipette in a single operation). In an example, at least 4 sorted aliquots of expression cassettes are mixed with sorted host cells in a single operation so that each expression cassette aliquot is mixed with a respective cell aliquot (e.g., in a respective container, e.g., in a well of a plate or a tube in a rack). In an example, at least 4, 8, 12, 16, 24, 32, 40, 48, 56, 64, 96, 384 or 1536 sorted aliquots of expression cassettes are mixed in the same operation with sorted host cells so that each expression cassette aliquot is mixed with a respective cell aliquot (e.g., in a respective container, e.g., in a well of a plate or a tube in a rack). In one embodiment, the operation is a manual operation (e.g., by pipetting using a multi-channel pipette). In another embodiment, the operation is automated, e.g., performed by an automated liquid handling apparatus (e.g., a liquid handling robot).

Advantageously, the inventors have found it possible to batch transfer the expression cassettes from step (d) to the sorted host cells without the need to purify the cassettes, but still yielding a useful host cell repertoire in step (e). This provides for higher processing speeds and throughput and makes the process amenable to simpler automation.

-   3. The method of concept 1 or 2, wherein (i) the sorted repertoire of expression cassettes produced by step (d) is provided in a plurality of containers whose locations relative to each other are fixed (e.g., wells on a plate or tubes in a rack), wherein each container contains a respective type of expression cassette such that the location of the expression cassettes relative to each other is predetermined; and (ii) the expression cassettes are transferred to the sorted host cells in step (e) such that the relative locations of the expression cassettes are maintained.

In an example, each container contains cassettes encoding a first type of POIs derived from a respective single cell provided in step (a) and also cassettes encoding a second type of POIs derived from said single cell. For example, each container contains cassettes encoding V_(H) and V_(L) sequences derived from a single cell. In this way the variable domains making up respective binding sites of input cells (where these encode antibody binding sites, e.g., plasmablasts or plasma cells) are kept together in the sorted repertoire but not mixed with sequences derived from another cell. This sorting is, thus, advantageously maintained in later steps of the process and is traceable in the result of the subsequent screening.

-   4. The method of any preceding concept, wherein (i) the repertoire of expression cassettes produced by step (d) is sorted by providing a plurality of containers (e.g., wells on a plate or tubes in a rack), wherein each container comprises POI-encoding sequences of a single cell sorted in step (b); (ii) the sorted host cells of step (e) are provided in a plurality of containers (e.g., wells on a plate or tubes in a rack) and (iii) the expression cassettes are transferred to the sorted host cells in step (e) such that host cells in each respective container are mixed only with POI-encoding sequences derived from a single cell sorted in step (b).
-   5. The method of concept 3 or 4, wherein in step (d) the sorted expression cassettes are provided in a plurality of containers whose locations relative to each other are fixed (e.g., a plurality of containers with an arrangement of X containers by Y containers, e.g., an 8×12 well plate (96-well plate) or a plate having a multiple of an 8×12 container arrangement (e.g., a 384 well plate)); and in step (e) the sorted host cells are provided in a plurality of containers whose locations relative to each other are fixed and comprise the same arrangement as the containers used in step (d) (e.g., the repertoires of steps (d) and (e) are both provided on 96-well or 384-well plates of the same or substantially the same dimension) so that sorting is maintained in step (e).
-   6. The method of any preceding concept, wherein the sorted repertoire of host cells is capable of stably expressing the repertoire of POIs.

Alternatively, the sorted repertoire of host cells is capable of transient expression. Stable expression (e.g., as a result of genomic integration of cassettes in host cell genomes) is advantageous for longer-term supply of cells (and thus expressed POIs) identified after screening, and also for cells while waiting to carry out screening if the host cells produced in step (e) are stored for a while before being used for screening.

-   7. The method of any preceding concept, wherein the sorted repertoire of POI-expressing host cells produced by step (e) is provided in a plurality of containers (e.g., tubes or wells), wherein each container contains POIs of a single cell sorted in step (b).

Such tubes may be fixed in a rack or holder; such wells may be fixed by provision on one or more plates.

-   8. The method of concept 7, wherein each host cell expresses first and second POIs, wherein the POIs are different, e.g., subunits of a protein, e.g., variable domains of an antibody or T-cell receptor binding site.
-   9. The method of any preceding concept, wherein step (b) comprises sorting single cells into respective containers (e.g., respective wells on one or more plates) and carrying out steps (c) and (d) in said containers while maintaining the sorting.
-   10. The method of any preceding concept, further comprising screening the sorted POI repertoire to identify a POI with a desired characteristic (e.g., binding to an antigen or antibody; or binding affinity for a cognate ligand or antigen) and/or a nucleotide sequence encoding the identified POI (e.g., DNA or RNA, e.g., mRNA or cDNA).
-   11. The method of concept 10, further comprising identifying, amplifying or synthesizing the nucleotide sequence encoding the identified POI (e.g., using PCR or by culturing a selected host cell or derivative cell thereof); and optionally producing isolated POI using said identified, amplified or synthesized nucleotide sequence.
-   12. The method of any preceding concept, further comprising screening the sorted POI repertoire to identify a POI with a desired characteristic (e.g., binding to an antigen or antibody; or binding affinity for a cognate ligand or antigen) and isolating a host cell expressing the identified POI; and optionally propagating the cell to produce a cell line expressing the POI.
-   13. The method of any preceding concept, wherein step (e) comprises genomically-integrating (e.g., chromosomally integrating) POI expression cassettes into respective host cell genomes for expressing the respective POIs. Alternatively, one or more of the cassettes is provided episomally in its respective host cell for transient POI expression.
-   14. The method of concept 13, wherein said genomic integration is carried out using a predetermined genomic nucleotide sequence motif for insertion of the expression cassettes into the respective cell genome.

For example, the motif is a nucleotide sequence used by a transposon for integration (e.g., the TTAA motif used by PB); or a nucleotide sequence that can recombine with cassette 5′ and 3′ integration sequences by homologous recombination; or a motif used to integrate a site-specific recombination site.
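As an illustration of locating copies of such a motif in a nucleotide sequence, the following minimal sketch reports every occurrence of TTAA (the motif named above for PB), including overlapping occurrences. The function name and the toy sequence are assumptions for the example; it searches the given strand only and says nothing about which sites are actually used for integration.

```python
def find_motif_sites(sequence: str, motif: str = "TTAA") -> list[int]:
    """Return 0-based start positions of every (possibly overlapping)
    occurrence of `motif` in `sequence`, case-insensitively."""
    if not motif:
        raise ValueError("motif must not be empty")
    seq = sequence.upper()
    target = motif.upper()
    sites = []
    start = seq.find(target)
    while start != -1:
        sites.append(start)
        # Advance by one position so overlapping occurrences are also found.
        start = seq.find(target, start + 1)
    return sites


if __name__ == "__main__":
    toy_sequence = "GGTTAACCTTAATT"  # placeholder sequence, not from the specification
    print(find_motif_sites(toy_sequence))  # [2, 8]
```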

-   15. The method of concept 14, wherein each cell genome comprises more than one copy of the sequence motif.
-   16. The method of concept 13, 14 or 15, wherein said genomic integration is carried out by transposon-mediated integration.

Suitable transposon elements for use as 5′ and 3′ integration sequences of cassettes are class II transposon elements (e.g., piggyBac transposon inverted terminal repeat elements or Mariner transposon elements), or Sleeping Beauty transposon elements or Tc1-like elements (TLEs).

-   17. The method of any one of concepts 13 to 16, wherein step (e) comprises multiple insertions of expression cassettes into respective host cell genomes.

Each cassette is optionally provided as part of linear nucleic acid (e.g., linear DNA). For example, each cassette is a linear transposon comprising or consisting of 5′- and 3′-terminal transposon elements (e.g., piggyBac inverted terminal repeat elements) with POI nucleotide sequence and regulatory element(s) for expression between the transposon elements. In an embodiment, there is further sequence 5′ of the 5′ transposon element and/or 3′ of the 3′ transposon element; in another embodiment these elements are at the 5′ and 3′ termini of the cassette respectively.

-   18. The method of any preceding concept, wherein the host cells are cells of a mammalian cell line (e.g., CHO or HEK293 cells) or yeast cells.

For example, each host cell is a mammalian (e.g., human or non-human animal, plant or insect or rodent or mouse or rat or rabbit or chicken or Camelid or fish cell), bacterial or yeast cell.

-   19. The method of any preceding concept, wherein in step (a) the cells are cells isolated from one or more animals.

Optionally, each cell of step (a) is a mammalian (e.g., human or non-human animal, plant or insect or rodent or mouse or rat or rabbit or chicken or Camelid or fish cell), bacterial or yeast cell.

Optionally, all cells of step (a) are cells of the same type of tissue or compartment of an organism(s). For example, they are all liver, kidney, heart, brain, blood, lymphocyte, prostate, ovary or germinal cells of one or more organisms, e.g., a human patient or a non-human animal, or a rodent or mouse or rat or rabbit or chicken or Camelid or fish.

-   20. The method of any preceding concept, wherein in step (a) the cells comprise or consist of B-cells, germinal centre cells, memory B-cells, antibody-secreting cells, plasma cells or plasmablast cells.
-   21. The method of any preceding concept, wherein each POI is an immunoglobulin (e.g., antibody or T-cell receptor) chain or part thereof (e.g., a variable domain).
-   22. The method of any preceding concept, wherein each POI comprises or consists of an antibody variable domain (e.g., a VH, VHH or VL domain or a dAb or Nanobody™).
-   23. The method of any preceding concept, wherein each cell of step (a) expresses first and second POIs, wherein the POIs are different from each other; wherein step (b) comprises sorting single cells into respective containers (e.g., respective wells on one or more plates) and carrying out steps (c) and (d) in said containers, wherein after step (c) each container comprises amplified first POIs mixed with amplified second POIs from the same cell; and wherein step (e) comprises mixing respective first and second POI-encoding nucleic acids from a respective container with host cells; wherein first and second POIs from the same cell of step (a) are transferred to the same host cell for expression of first and second POIs by the host cell, thereby producing a repertoire of sorted host cells each co-expressing respective first and second POIs.
-   24. The method of concept 23, wherein the first and second POIs from the same cell are cognate polypeptides that together form a functional protein (e.g., V_(H) and V_(L) domains that form an antigen binding site).
-   25. The method of concept 24, wherein the first and second POIs comprise or consist of antibody V_(H) and V_(L) domains respectively, e.g., the first and second POIs are cognate antibody heavy and light chains respectively.
-   26. The method of any preceding concept, wherein step (b) comprises binding POIs expressed by cells to a cognate ligand (e.g., binding antibody binding sites expressed by cells to an antigen of interest); optionally wherein the ligand binds cell surface-expressed POI; and
    -   further sorting and isolating cells that express POIs that bind the ligand, thereby producing the sorted population of cells.
-   27. The method of concept 26, wherein FACS cell sorting is used; optionally fluorescence FACS.
-   28. The method of concept 26 or 27, wherein each sorted cell of step (b) is provided in a respective container (e.g., a well on a plate), such that each such container (e.g., well) comprises no more than one cell type.
-   29. The method of concept 28, wherein the sorted cell population is provided in wells on one or more plates comprising in total less than 5, 4, 3, 2, 1 or 0.5% wells containing more than one cell and/or in total less than 5, 4, 3, 2, 1 or 0.5% wells containing no cell.
-   30. The method of any preceding concept, wherein step (c) is performed using PCR, e.g., RT-PCR using POI-encoding RNA (e.g., mRNA).
-   31. The method of any preceding concept, wherein step (d) is performed using PCR, e.g., bridge PCR.
-   32. The method of any preceding concept, wherein step (d) comprises modification of amplified nucleic acids by combination with a predetermined sequence so that said predetermined sequence flanks 5′ and/or 3′ of POI-encoding nucleotide sequences of the nucleic acids; optionally wherein the modification places a regulatory element (e.g., a 5′ promoter and/or a 3′ polyA) and/or a transposon element (e.g., a piggyBac transposon element) flanking 5′ and/or 3′ of POI-encoding nucleotide sequences.
-   33. The method of concept 32, wherein each POI comprises an antibody variable domain and step (c) or (d) combines POI-encoding nucleotide sequences with an antibody constant region (optionally a human constant region or one of a species that is different to the species of C region comprised by POIs in the cells of step (a)) to produce a nucleotide sequence encoding an antibody chain (optionally a humanised chain).

For example, the cells of step (a) encode POIs comprising non-human vertebrate (e.g., rodent, e.g., mouse or rat) constant regions and these are replaced by human constant regions by step (c) or (d). For example, the POIs are antibody chains (e.g., heavy chains) comprising a human variable domain and a rodent (e.g., mouse or rat) constant region that is humanised by the method of the invention. This is convenient since it provides a high-throughput way to humanise antibody chains and antibodies at scale and enables subsequent selection, production and cell and expression vector development in the context of final human antibody/chain formats suitable for human therapeutic drug use. Prior art techniques do not do this.

-   34. A method according to concept 33, further comprising screening the sorted POI repertoire to identify a host cell expressing an antibody chain with a desired characteristic (e.g., specific antigen binding or antigen binding affinity), identifying the antibody chain-encoding nucleotide sequence of the host cell, using the antibody chain-encoding nucleotide sequence to produce copies of the identified antibody chain, and formulating the copies as a pharmaceutical composition (optionally in combination with one or more further drugs, excipients, diluents or carriers) for human medical therapy; and optionally administering the composition to a human patient for medical therapy of the patient.
-   35. The method of any preceding concept, wherein step (e) is automated; optionally wherein one or all of steps (b) to (c) are also automated.

Automation may include control of the process by a computer programmed to carry out the method of any aspect, configuration, embodiment or example of the invention.

-   36. The method of any preceding concept, wherein (i) steps (b) to (e) inclusive are carried out in equivalent of at least 180 cells (provided in step (a)) processed in no more than 1 or 2 days; and/or (ii) the repertoire of expressed POIs is screened for a desired characteristic and one or more corresponding host cells or POI-encoding nucleotide sequences are identified in an equivalent of no more than 4 days.

The inventors have achieved this using approximately 400 input B-cells and screening for antigen-specific antibodies, with (i) taking 2 days and (ii) taking 3 days, all done manually. Clearly, this would be even faster if automation were used. Thus, the invention provides a significant time saving over state-of-the-art techniques, which typically take 6-8 weeks to perform such screening.

-   37. An automated apparatus for performing the method of any preceding concept, the apparatus comprising
    -   a. Means for holding a sorted single cell population in a plurality of containers (e.g., wells on one or more plates, or tubes in a rack or holder as described above) wherein each single cell is in a respective container, each cell comprising nucleic acid encoding a respective POI;
    -   b. Means for delivering PCR reagents to the containers for amplifying nucleic acid comprised by the sorted single cell population to produce a sorted repertoire of amplified nucleic acids encoding POIs;
    -   c. Means for delivering to the containers reagents for modifying sorted amplified POI-encoding nucleic acids to produce a sorted repertoire of expression cassettes, each cassette comprising a nucleotide sequence encoding a POI and one or more regulatory elements for expressing the POI;
    -   d. Means for holding a sorted population of host cells in a plurality of containers;
    -   e. Means for transferring (optionally batch transferring) POI expression cassettes from said cassette repertoire to the sorted population of host cells in the containers while maintaining the POI expression cassette sorting; and
    -   f. Means for carrying out introduction (e.g., transfection) of expression cassettes into host cells in the containers to produce a sorted repertoire of host cells that expresses a sorted repertoire of POIs; and
    -   g. Optionally a computer programmed to carry out the method of any aspect, configuration, embodiment or example of the invention.
-   38. The apparatus of concept 37, further comprising means (e.g., means for performing FACS) for sorting a population of cells to produce the sorted population of single cells.
-   39. The apparatus of concept 37 or 38, further comprising means for controlling operation of the apparatus for automated performance of the method of any one of concepts 1 to 36.
-   40. A kit for carrying out the method of any one of concepts 1 to 36, the kit comprising an apparatus according to concept 37, 38 or 39 together with nucleic acid comprising transposon element(s) for performing the method of concept 32.

The transposon elements can be carried by, e.g., linear DNA. In an example, the elements are elements of a transposon that mediates DNA integration by a cut-and-paste transposition mechanism (e.g., Class II transposon). In an example, the elements are PB or Mariner-like elements or Tc-1-like elements (TLEs).

-   41. An expression cassette for expression of a POI in a host cell,    the cassette being provided by linear nucleic acid (e.g., linear    DNA) comprising a transposon, the transposon comprising 5′- and    3′-terminal transposon elements with a POI-encoding nucleotide    sequence and regulatory element(s) for POI expression between the    transposon elements.

Such cassettes are useful for genomically-integrating expressible POI sequences into host cells, e.g., for producing a cell line to provide a POI source and/or for use in the screening method of the invention. The transposon elements can be any such elements disclosed herein.

In an example, the cassette comprises or consists of a transposon comprising 5′- and 3′-terminal transposon elements (e.g., piggyBac inverted terminal repeat elements) with a POI-encoding nucleotide sequence and one or more regulatory element(s) for expression between the transposon elements. In an embodiment, the cassette comprises a further sequence 5′ of the 5′ transposon element and/or 3′ of the 3′ transposon element. In an example, the cassette comprises an additional nucleotide sequence corresponding to a nucleotide sequence of the host cell genome, said additional sequence being 5′ and/or 3′ of the POI-encoding nucleotide sequence. For example, the additional sequence corresponds to genomic host cell sequence that is actively transcribed in the host. Thus, the POI-encoding sequence is inserted into the host in an environment suited to active transcription of the POI sequence.

In an example, a “population” (e.g., a population of cells or cassettes) or “repertoire” as used herein comprises at least 10, 100, 1000, 10⁴, 10⁵ or 10⁶ members.

-   42. A population of expression cassettes according to concept 41, wherein the population encodes a repertoire of POIs.
-   43. A sorted population of expression cassettes according to concept 42.
-   44. A sorted population of expression cassettes encoding a repertoire of POIs corresponding to POIs expressed by a population of cells, each cassette comprising a nucleotide sequence encoding a member of the repertoire of POIs and one or more regulatory elements for POI expression (when in a host cell), wherein each said cassette comprises the arrangement (in 5′ to 3′ direction): transposon element-[POI nucleotide sequence & regulatory element(s)]-transposon element, and expression cassettes for expression of POIs corresponding to POIs of different cells are isolated from each other in the sorted population (e.g., in different wells of a plate, e.g., one cassette species per well on one or more plates).

In an example, each cassette is capable of expressing a POI of (derived from) a single cell, e.g., an antibody heavy or light chain or fragment thereof derived from a single B-cell.
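The idea of a sorted population, in which material derived from each single cell stays in its own container throughout, can be pictured as a mapping keyed by well position. The following is a minimal sketch only; the 96-well layout, the helper names `well_labels` and `batch_transfer`, and the string placeholders for cells, amplicons and cassettes are illustrative assumptions and are not taken from the specification.

```python
from string import ascii_uppercase


def well_labels(rows=8, cols=12):
    """Return well labels (A1 ... H12) for an assumed 96-well plate layout."""
    return [f"{ascii_uppercase[r]}{c}" for r in range(rows) for c in range(1, cols + 1)]


def batch_transfer(source_plate, process):
    """Apply `process` to each well's content and place the result in the
    same well position on a new plate, so that the sorting is maintained."""
    return {well: process(content) for well, content in source_plate.items()}


# Hypothetical example: four sorted single B-cells in wells A1 to A4.
wells = well_labels()
cells = {well: f"B-cell-{i}" for i, well in enumerate(wells[:4], start=1)}

amplicons = batch_transfer(cells, lambda c: f"amplicon({c})")
cassettes = batch_transfer(amplicons, lambda a: f"cassette({a})")
host_cells = batch_transfer(cassettes, lambda k: f"host[{k}]")

# Each host-cell well can be traced back to exactly one input cell.
print(host_cells["A1"])
```

Because every step writes to the same well label it read from, one cassette species remains associated with one well, which is the property the sorted population relies on.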

In an example, piggyBac elements are used.

-   45. The population of concept 44, wherein each expression cassette is provided by a linear DNA.
-   46. The cassette population of any one of concepts 42 to 45, wherein each cassette is in a host cell.
-   47. A sorted population of host cells comprising the sorted population of expression cassettes according to any one of concepts 43, 44 and 45 for expression of a sorted repertoire of POIs.
-   48. A method of making a transposon comprising a nucleotide sequence of interest (NOI), the method comprising
    -   a. Providing a first nucleotide sequence (e.g., provided by DNA or RNA) comprising (in 5′ to 3′ direction) A, B and C (optionally consisting of the structure 5′-A-B-C-3′), wherein A is a first homology sequence, B is a nucleotide sequence comprising (or consisting of) the NOI and C is a second homology sequence;
    -   b. Providing a first template nucleotide sequence comprising (or consisting of) (in 5′ to 3′ direction) W and X, wherein W is a nucleotide sequence comprising (or consisting of) a first transposon element (e.g., a piggyBac terminal repeat element) and X is a third homology sequence; and
    -   c. Providing a second template nucleotide sequence comprising (or consisting of) (in 5′ to 3′ direction) Y and Z, wherein Y is a fourth homology sequence and Z is a nucleotide sequence comprising (or consisting of) a second transposon element (e.g., a piggyBac terminal repeat element); and either
    -   d. (i) Mixing the first nucleotide sequence with the first template to hybridise the first and third homology arms together and carrying out nucleic acid amplification and extension (e.g., using PCR) to extend the first nucleotide sequence using the first template to produce a first extended nucleotide sequence (first ENS) comprising (in 5′ to 3′ direction) W, B and C; and (ii) mixing the first ENS with the second template to hybridise the second and fourth homology arms together and carrying out nucleic acid amplification and extension to extend the first ENS to produce a second ENS comprising (or consisting of) (in 5′ to 3′ direction) W, B and Z; or
        -   (ii) Mixing the first nucleotide sequence with the second template to hybridise the second and fourth homology arms together and carrying out nucleic acid amplification and extension (e.g., using PCR) to extend the first nucleotide sequence using the second template to produce a third extended nucleotide sequence (third ENS) comprising (in 5′ to 3′ direction) A, B and Z; and (ii) mixing the third ENS with the first template to hybridise the first and third homology arms together and carrying out nucleic acid amplification and extension to extend the third ENS to produce a fourth ENS comprising (or consisting of) (in 5′ to 3′ direction) W, B and Z; or
        -   (iii) Mixing the first nucleotide sequence with the first and second templates to hybridise the first and third homology arms together and to hybridise the second and fourth homology arms together and carrying out nucleic acid amplification and extension (e.g., using PCR) to extend the first nucleotide sequence using the second template to produce a fifth ENS comprising (or consisting of) (in 5′ to 3′ direction) W, B and Z; and
    -   e. Isolating an ENS comprising (or consisting of) (in 5′ to 3′ direction) W, B and Z, thereby producing an isolated transposon comprising a NOI flanked by transposon elements; and
    -   f. Optionally introducing the isolated transposon into a recipient cell so that the transposon integrates into the genome of the cell.
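The two sequential routes of step (d) of concept 48 can be illustrated with a toy string model. This is a simplified sketch under stated assumptions: sequences are treated as single strands, a homology sequence on a template is taken to be identical to its partner on the first nucleotide sequence (so X equals A and Y equals C, ignoring strand complementarity), and all sequences and function names below are invented placeholders rather than sequences or terms from the specification.

```python
def extend_upstream(fragment, template, homology_len):
    """Add `template` 5' of `fragment` when the 3' end of the template
    matches the 5' homology region of the fragment."""
    overlap = fragment[:homology_len]
    if not template.endswith(overlap):
        raise ValueError("template does not share the 5' homology sequence")
    return template + fragment[homology_len:]


def extend_downstream(fragment, template, homology_len):
    """Add `template` 3' of `fragment` when the 5' end of the template
    matches the 3' homology region of the fragment."""
    overlap = fragment[-homology_len:]
    if not template.startswith(overlap):
        raise ValueError("template does not share the 3' homology sequence")
    return fragment + template[homology_len:]


# Toy placeholder sequences (assumptions for illustration only).
A = "GATTACAGGC"      # first homology sequence
B = "ATGGCTAGCTGA"    # NOI
C = "CCTAGGTCAA"      # second homology sequence
W = "CCCTAGAAAG"      # stands in for the first transposon element
Z = "CTTTCTAGGG"      # stands in for the second transposon element

first_sequence = A + B + C
first_template = W + A     # 5'-W-X-3' with X modelled as identical to A
second_template = C + Z    # 5'-Y-Z-3' with Y modelled as identical to C

# Route (i): extend with the first template, then with the second.
first_ens = extend_upstream(first_sequence, first_template, len(A))
second_ens = extend_downstream(first_ens, second_template, len(C))

# Route (ii): extend with the second template, then with the first.
third_ens = extend_downstream(first_sequence, second_template, len(C))
fourth_ens = extend_upstream(third_ens, first_template, len(A))

print(second_ens == fourth_ens)
```

In this model both orders give the same product, which carries W at its 5′ end, the NOI (B) in the middle and Z at its 3′ end, with the homology regions retained between them.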

Optionally one, more or all of the homology sequences comprises a nucleotide sequence of at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more contiguous nucleotides.

In an example, the NOI encodes a POI, protein domain or protein fragment or is itself one or more regulatory element(s). For example, the NOI encodes a POI that is an orthologue or homologue of a protein in the recipient cell genome or in a human or non-human vertebrate.

In an embodiment, W and X are at the 5′ and 3′ termini of the first template sequence respectively. Additionally or alternatively, Y and Z are at the 5′ and 3′ termini of the second template sequence respectively. When W and X are at the 5′ and 3′ termini of the first template sequence respectively and Y and Z are at the 5′ and 3′ termini of the second template sequence respectively, the product of the method is a linear transposon with transposon elements at its termini, which is well suited to genomic integration to modify host cells.

Optionally the first template consists of 5′-W-X-3′. In an example, there is no intervening nucleotide sequence between W and X. In another embodiment, there is a further nucleotide sequence between W and X, e.g., a regulatory element or exon or other desired nucleotide sequence (e.g., protein-coding sequence) which will become combined immediately upstream of the NOI in the product of the method. This is useful, for example, for constructing an expression cassette combining a promoter upstream of a NOI (where the NOI encodes a POI) for subsequent expression of the POI once the transposon has been inserted into a host cell genome.

Additionally or alternatively, optionally the second template nucleotide sequence consists of 5′-Y-Z-3′ or there is an intervening nucleotide sequence between Y and Z, e.g., a regulatory element or exon or other desired nucleotide sequence (e.g., protein-coding sequence) which will become combined immediately downstream of the NOI in the product of the method. This is useful, for example, for constructing an expression cassette combining a polyA downstream of a NOI (where the NOI encodes a POI) for subsequent expression of the POI once the transposon has been inserted into a host cell genome. In another example, the intervening sequence encodes a protein that will become fused to the POI upon expression to produce a fusion product. For example, the POI comprises or consists of an antibody variable domain and the intervening sequence comprises or consists of an antibody constant region-encoding sequence. For example, the intervening sequence encodes an antibody Fc or an antibody CH1 or CL domain. In an example, the Fc or constant region or protein encoded by the intervening sequence is a human Fc, constant region or protein. This is useful for humanising the POI (e.g., to produce a humanised antibody chain when the POI is a variable domain, e.g., a human variable domain).

-   49. The method of concept 48, wherein there is an intervening nucleotide sequence between W and X and/or an intervening nucleotide sequence between Y and Z; optionally wherein the or each intervening sequence is a regulatory element or protein-coding sequence.
-   50. The method of concept 49, wherein the NOI encodes a protein domain (e.g., an antibody variable domain) and there is a nucleotide sequence encoding an antibody constant region (e.g., an antibody Fc, e.g., a human Fc) between Y and Z, whereby the transposon product encodes a fusion protein comprising a protein domain fused to an antibody constant region (e.g., encoding an antibody chain).
-   51. The method of any one of concepts 48 to 50, wherein one or more of the first and second homology arms is combined with the NOI by PCR (e.g., 5′- and/or 3′-RACE) to form said first nucleotide sequence before carrying out said extension.
-   52. A method of making a repertoire of transposons, wherein members of the repertoire encode different POIs (e.g., different antibody variable domains), the method comprising
    -   i. Providing a population of first nucleotide sequences comprising a repertoire of NOIs; and
    -   ii. For each first nucleotide sequence, carrying out the method of any one of concepts 48 to 51, thereby producing a repertoire of transposons encoding a repertoire of POIs.
-   53. The method of concept 52, comprising sorting the first nucleotide sequences to provide a sorted population before carrying out step (ii), wherein a sorted repertoire of transposons is produced encoding a sorted repertoire of POIs.
-   54. The method of concept 52, wherein the transposons of said repertoire of transposons are mixed together.
-   55. The method of any one of concepts 52 to 54, wherein transposons of the repertoire are introduced into recipient cells so that transposons integrate into the genome of cells, each integrated transposon comprising a POI expression cassette flanked by transposon elements, the cassette comprising a NOI and one or more regulatory elements for expression of the POI in a host cell.
-   56. The method of concept 55 when dependent from concept 53, wherein the sorting is maintained when the transposons are introduced into the cells, thereby producing a sorted repertoire of cells expressing a sorted repertoire of POIs.
-   57. A method of producing a host cell for expression of a POI, the method comprising
    -   a. Providing at least first and second expression cassettes, wherein each expression cassette comprises
        -   i. a first integration element and a second integration element 3′ of the first integration element nucleotide sequence; and
        -   ii. between the integration elements a nucleotide sequence encoding a POI and one or more regulatory elements for expressing the POI;
        -   iii. wherein the integration elements are capable of insertion into a nucleic acid by recognition of a predetermined nucleotide sequence motif of the nucleic acid using an integration enzyme;
    -   b. Providing a host cell whose genome comprises a plurality of said motifs; and
    -   c. Simultaneously or sequentially introducing the first and second expression cassettes into the host cell, wherein each cassette is genomically-integrated in the host cell genome at a said motif for expression of POIs by the host cell; and
    -   d. Optionally producing a cell line expressing POIs in a step comprising culturing the host cell.

This aspect of the invention is useful for producing host cells and cell lines for relatively high expression of one or more POIs. Genomic integration of POI cassettes at multiple genomic sites provides for stable expression and there is also the possibility to target transcriptionally-active regions of the host genome. Use of sequence motifs guides the insertion to useful sites and this is preferable to random integration of sequences as used in the art.

-   58. The method of concept 57, wherein the first and second    integration elements of the first cassette are identical to the    first and second integration elements respectively of the second    cassette.

In an example, each cassette comprises first and second transposon elements, e.g., elements of the same type of transposon (e.g., PB or a Class II transposon). In an example, all elements are site-specific recombination sites, e.g., lox sites or frt sites or a mixture of these. In another example, all elements are homology arms (contiguous nucleotide sequences sufficient for homologous recombination in the host cell). In an example, the site-specific recombination sites are the same or they are different (e.g., mutually incompatible sites, such as loxP and lox511 or lox2272, for carrying out RMCE (recombinase-mediated cassette exchange) for directed insertion of the cassette into the genome).

-   59. The method of concept 57 or 58, wherein the first and second integration elements of each of said first and second cassettes are in mutually inverted orientation (e.g., inverted PB transposon elements or inverted site-specific recombination sites).
-   60. The method of any one of concepts 57 to 59, wherein one or more of said motifs is engineered into the chromosome of the host cell prior to carrying out step (c) (e.g., lox site pairs are engineered into one or more host chromosomes, wherein pairs corresponding to lox pairs in cassettes are used).
-   61. The method of any one of concepts 57 to 59, wherein one or more of said motifs is endogenous to the host cell genome; optionally wherein each of said motifs at which a cassette is integrated is an endogenous motif.

For example, transposons recognise endogenous motifs (e.g., PB recognises TTAA in genomes).
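As a simple illustration of how many such endogenous motifs a sequence can offer, the sketch below lists the positions of TTAA in a nucleotide string. The function name and the example sequence are hypothetical placeholders. Because TTAA is its own reverse complement, scanning one strand is sufficient to find every occurrence.

```python
def find_ttaa_sites(sequence):
    """Return the 0-based start positions of every TTAA motif in `sequence`,
    including overlapping occurrences."""
    sequence = sequence.upper()
    sites = []
    position = sequence.find("TTAA")
    while position != -1:
        sites.append(position)
        position = sequence.find("TTAA", position + 1)
    return sites


# Hypothetical short stretch of genomic sequence.
example = "GGTTAACCATTAAG"
print(find_ttaa_sites(example))  # [2, 9]
```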

-   62. The method of any one of concepts 57 to 61, wherein at least 3 cassettes are genomically-integrated into the host cell genome, e.g., into one or more host chromosomes, which is useful for stable expression.
-   63. The method of any one of concepts 57 to 62, wherein the cassette genomic integration sites are active for transcription of the POI-encoding sequences.

This can be achieved using transposons (e.g., PB) in the cassettes.

-   64. The method of any one of concepts 57 to 63, wherein each cassette is integrated by homologous recombination between the integration sites and the host genome; site-specific recombination between the integration sites and the host genome; or by transposon-mediated integration.
-   65. The method of concept 64, wherein the enzyme is selected from a recombinase or a transposase (e.g., an enzyme corresponding to the integration elements, such as PBase (e.g., hyperactive PBase), flp or cre recombinase).

In an example, the host cell has been engineered to express such enzyme(s), e.g., from a genomically-integrated gene (e.g., an inducible gene). In another embodiment, the enzyme is expressed from an episomal vector. In another example, the enzyme is introduced into the host cell.

-   66. The method of any one of concepts 57 to 64, wherein each cassette is a transposon.
-   67. The method of any one of concepts 57 to 66, comprising providing a population of host cells and carrying out the method of any one of concepts 57 to 66 on a plurality of host cells of said population; and optionally isolating the host cells produced by step (c) or (d).
-   68. The method of any one of concepts 57 to 67, comprising isolating a host cell produced by step (c) or (d) and identifying, amplifying or synthesizing the nucleotide sequence encoding the POI expressible by the cell; and optionally producing isolated POI using said identified, amplified or synthesized nucleotide sequence or a mutant thereof.
-   69. The method of concept 68, comprising formulating the isolated POI into a drug for human medicine; and optionally administering the drug to a human patient.
-   70. A population of host cells obtainable by the method of any one of concepts 57 to 69, each cell comprising a plurality of genomically-integrated expression cassettes for expressing POIs, each host cell comprising a plurality of identical nucleotide sequence motifs throughout its genome adjacent an integrated expression cassette for expression of a POI from each such cassette; each integrated cassette comprising
    -   a. a first integration element sequence and a second integration element sequence 3′ of the first integration element nucleotide sequence; and
    -   b. between the integration element sequences a nucleotide sequence encoding a POI and one or more regulatory elements for expressing the POI.
-   71. The host cells of concept 70, wherein all POIs expressed by the cells are the same POI.
-   72. The host cells of concept 70, wherein the cells express first and second POIs (e.g., VH and VL domains of a single antibody type; or heavy and light chains of a single antibody type) that associate together to form a functional protein or ligand (e.g., antigen) binding site.
-   73. An antibody or antigen binding site of an antibody for medical treatment of a human, wherein the antibody or binding site has been isolated from a host cell produced by a method of any one of concepts 57 to 69 or isolated from a host cell of a population according to any one of concepts 70 to 72.
-   74. A nucleic acid mixture comprising a first isolated nucleic acid and a second isolated nucleic acid, wherein the first nucleic acid is capable of hybridising to a human antibody V region 5′UTR sequence (i.e., a nucleotide sequence) of a gene comprised by a target nucleic acid, wherein the gene encodes a human V region; and the second nucleic acid is capable of hybridising to a second sequence, wherein the second sequence is comprised by the target nucleic acid and is 3′ to the UTR sequence, wherein the first isolated nucleic acid comprises (or consists of) a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to a sequence selected from the group consisting of SEQ ID NOs: 1-47.

SEQ ID NOs: 1-47 comprise human variable region-specific sequences as indicated in Table 1 (more particularly, specific 5′UTR nucleotide sequences of human variable regions). By “specific to” is meant that such sequences can be used as a 5′ primer sequence in standard PCR (e.g., RT-PCR) of human variable region nucleic acid.

SEQ ID NOs: 1-17 comprise human heavy chain variable region-specific sequences.

SEQ ID NOs: 18-26 comprise human kappa chain variable region-specific sequences.

SEQ ID NOs: 27-47 comprise human lambda chain variable region-specific sequences.

In an embodiment, the invention provides a nucleic acid (e.g., a PCR primer or a vector for homologous recombination) comprising (or consisting of) at least 15 contiguous nucleotides of a Sequence denoted X in Table 2 for hybridising to the 5′UTR sequence of a gene segment denoted Y in Table 2, e.g., for performing PCR to copy the gene segment or to hybridise a homologous recombination vector to the 5′UTR sequence for modification of the gene segment. In an example, the nucleic acid comprises (or consists of) at least 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25 or all of sequence X. In an example, the contiguous nucleotides end with the 3′ nucleotide of sequence X (i.e., the contiguous nucleotides extending 5′ from the 3′ end of X are used). In an embodiment, the invention provides a mixture of two or more of the nucleic acids, e.g., for PCR copying of two or more variable region sequences (e.g., using DNA, cDNA or RNA from corresponding B-cells). In an example, two, more or all of the nucleic acids in the mixture copy V_(H) gene segments. In an example, two, more or all of the nucleic acids in the mixture copy Vλ gene segments. In an example, two, more or all of the nucleic acids in the mixture copy Vκ gene segments. Optionally, the, or each, nucleic acid comprises a promoter nucleotide sequence immediately 5′ of the UTR sequence (or the 15 or more contiguous nucleotide part). For example, the promoter sequence is a CMV promoter sequence as follows:

(SEQ ID NO: 54) 5′-CTTACTGGCTTATCGAAATTAATACGACTCAGATC-3′

In an example, the invention provides:

A PCR primer or homologous recombination vector comprising at least 15 contiguous nucleotides of a human antibody variable gene segment UTR sequence for hybridising to the 5′UTR sequence of an antibody variable gene segment denoted Y in Table 2, wherein the primer/vector sequence is selected from the group consisting of the sequences denoted X in Table 2.
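The selection of "at least 15 contiguous nucleotides" that end with the 3′ nucleotide of a sequence X, optionally with a promoter sequence placed immediately 5′, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the 20-nucleotide `sequence_x` is an invented placeholder and not a sequence from Table 2, and the promoter string is the one recited above as SEQ ID NO: 54.

```python
def three_prime_anchored(sequence_x, min_length=15):
    """Return every stretch of `sequence_x` that is at least `min_length`
    nucleotides long and ends with the 3' nucleotide of `sequence_x`."""
    if len(sequence_x) < min_length:
        raise ValueError("sequence is shorter than the minimum length")
    return [sequence_x[-n:] for n in range(min_length, len(sequence_x) + 1)]


def add_promoter(primer_core, promoter):
    """Place a promoter sequence immediately 5' of the primer core."""
    return promoter + primer_core


# Placeholder 20-nt sequence standing in for a sequence X (assumption).
sequence_x = "ACGTACGTACGTACGTACGT"
promoter = "CTTACTGGCTTATCGAAATTAATACGACTCAGATC"  # SEQ ID NO: 54 as recited above

candidates = three_prime_anchored(sequence_x)
for core in candidates:
    print(len(core), add_promoter(core, promoter))
```

Every candidate shares the same 3′ end as sequence X, so shortening only removes nucleotides from the 5′ side of the hybridising region.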

In an example of the nucleic acid, mixture or primer of the invention, each nucleic acid or primer hybridises with its cognate sequence at a temperature of from 45-70° C. (e.g., at 50° C., 60° C. or 68° C.) or 60-75° C. in a PCR reaction. A person skilled in the art will be aware of cycle times and temperatures to carry out the PCR reaction.

Each nucleic acid of the invention and mixture of the invention is useful for performing PCR amplification or replication of a target nucleotide sequence encoding a human antibody variable domain or a protein comprising such a domain, for example PCR of human variable region-encoding nucleotide sequence(s) isolated from one or more cells (e.g., B-cells). Thus, in an embodiment, each nucleic acid is a PCR primer.

Each nucleic acid of the invention and mixture of the invention is useful for performing homologous recombination to modify a target nucleotide sequence (e.g., a sequence comprised by the genome of a cell, e.g., a mammalian cell, e.g., an ES cell or CHO cell). For homologous recombination, as is known by the skilled person, one uses a nucleic acid vector comprising a 5′ homology arm, a 3′ homology arm and optionally a predetermined nucleotide sequence of interest therebetween. The sequence can, for example, encode a POI or a protein domain, or comprise a regulatory element. In an alternative, there is no intervening sequence between the homology arms (and in this case the vector is used to delete sequence from the genome lying between regions that hybridise with the homology arms, as is known by the skilled person). In the present embodiment, the invention provides a homologous recombination vector, wherein the vector comprises a 5′ homology arm, a 3′ homology arm and optionally a nucleotide sequence therebetween, wherein the 5′ arm comprises (or consists of) a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical to a sequence selected from the group consisting of SEQ ID NOs: 1-47 (or is 100% identical) and/or wherein the 3′ arm comprises (or consists of) a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or is 100% identical) to a sequence selected from the group consisting of SEQ ID NOs: 48 to 53. This enables gene targeting of specific V and/or C gene segments of Ig loci in a vertebrate using homologous recombination.

The invention, therefore, also provides a method of modifying an Ig locus comprised by a vertebrate cell, the method comprising introducing the vector of the invention into the cell (e.g., by transfection) and carrying out homologous recombination to modify the Ig locus; and optionally expressing an antibody V domain or chain from the modified locus. Optionally, the sequence of the V domain is identified or copied or isolated from the cell and used to produce an antibody or a pharmaceutical composition comprising such an antibody for human medical use.

The invention further provides a PCR primer that comprises (or consists of) a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical to a sequence selected from the group consisting of SEQ ID NOs: 1-53 (or is 100% identical). For example, the primer is in vitro.

The term “isolated” excludes sequences that are present in the chromosomal content of a vertebrate or a vertebrate cell.

The nucleic acid, PCR primer or mixture may be provided in vitro, e.g., mixed with a PCR buffer or reagent. In an example, a nucleic acid, primer or mixture of the invention is provided in a container, a vial, a tube, a dish or a PCR cuvette.

Optionally, the first isolated nucleic acid comprises (or consists of) a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical to a sequence selected from the group consisting of SEQ ID NOs: 1-47. Optionally, the first isolated nucleic acid comprises (or consists of) a sequence selected from the group consisting of SEQ ID NOs: 1-47.
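A percent-identity threshold of this kind can be checked with a short calculation. The sketch below is a simplification under stated assumptions: it compares two sequences of equal length position by position without gaps, whereas identity between sequences of different lengths is normally determined with an alignment tool. The function names and the 20-nucleotide sequences are hypothetical placeholders, not sequences from the Sequence Listing.

```python
def percent_identity(candidate, reference):
    """Ungapped percent identity between two equal-length sequences."""
    if not reference or len(candidate) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(candidate.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def meets_threshold(candidate, references, threshold=90.0):
    """True if the candidate is at least `threshold` percent identical to
    any equal-length reference sequence."""
    return any(
        len(candidate) == len(ref) and percent_identity(candidate, ref) >= threshold
        for ref in references
    )


# Placeholder sequences (assumptions for illustration only).
reference = "ACGTACGTACGTACGTACGT"
one_mismatch = "ACGTACGTACGTACGTACGA"     # 19 of 20 positions match
three_mismatches = "TCGTACGTAGGTACGTACGA"  # 17 of 20 positions match

print(percent_identity(one_mismatch, reference))      # 95.0
print(percent_identity(three_mismatches, reference))  # 85.0
print(meets_threshold(one_mismatch, [reference]))      # True
print(meets_threshold(three_mismatches, [reference]))  # False
```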

-   75. The mixture of concept 74, wherein the first isolated nucleic acid comprises a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical to a sequence selected from the group consisting of SEQ ID NOs: 1-17.

Optionally, the first isolated nucleic acid comprises a sequence selected from the group consisting of SEQ ID NOs: 1-17.

-   76. The mixture of concept 74, wherein the first isolated nucleic acid comprises a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical to a sequence selected from the group consisting of SEQ ID NOs: 18-26.

Optionally, the first isolated nucleic acid comprises a sequence selected from the group consisting of SEQ ID NOs: 18-26.

-   77. The mixture of concept 74, wherein the first isolated nucleic acid comprises a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical to a sequence selected from the group consisting of SEQ ID NOs: 27-47.

Optionally, the first isolated nucleic acid comprises a sequence selected from the group consisting of SEQ ID NOs: 27-47.

-   78. The mixture of any one of concepts 74 to 77, wherein the second    isolated nucleic acid comprises an antibody constant region    sequence; optionally a sequence that is at least 90, 91, 92, 93, 94,    95, 96, 97, 98 or 99% identical to a sequence selected from the    group consisting of SEQ ID NOs: 48 to 53.

Optionally, the second isolated nucleic acid comprises a sequence selected from the group consisting of SEQ ID NOs: 48 to 53. SEQ ID NOs: 48-51 are sequences from mouse constant regions; SEQ ID NOs: 52 and 53 are sequences from human constant regions (see Table 1).

-   79. The mixture of concept 75, wherein the second isolated nucleic acid comprises an antibody heavy chain constant region sequence; and optionally comprises a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to SEQ ID NO: 48 or 49.
-   80. The mixture of concept 76, wherein the second isolated nucleic acid comprises an antibody kappa chain constant region sequence; and optionally comprises a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to SEQ ID NO: 50 or 51.
-   81. The mixture of concept 77, wherein the second isolated nucleic acid comprises an antibody lambda chain constant region sequence; and optionally comprises a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to SEQ ID NO: 52 or 53.
-   82. A nucleic acid mixture comprising a first isolated nucleic acid and a second isolated nucleic acid, wherein the nucleic acids are different and selected from nucleic acids comprising a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to a sequence selected from the group consisting of SEQ ID NOs: 1-47.

In an example, the nucleic acids are PCR primers; in another embodiment they comprise homologous recombination vectors for modifying an Ig locus or loci.

-   83. The mixture of concept 82, comprising a sequence selected from the group consisting of SEQ ID NOs: 18-26 and a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to a sequence selected from the group consisting of SEQ ID NOs: 18-26 and/or selected from the group consisting of SEQ ID NOs: 27-47.
-   84. The mixture of concept 82 or 83, wherein each of the first and second isolated nucleic acids is selected that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to a sequence selected from the group consisting of SEQ ID NOs: 1-17.
-   85. The mixture of concept 84, comprising at least 3 different isolated nucleic acids each that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to a sequence selected from the group consisting of SEQ ID NOs: 1-17.
-   86. The mixture of concept 82 or 83, wherein each of the first and second isolated nucleic acids is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to a sequence selected from the group consisting of SEQ ID NOs: 18-26.
-   87. The mixture of concept 84, comprising at least 3 different isolated nucleic acids that each is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to a sequence selected from the group consisting of SEQ ID NOs: 18-26.
-   88. The mixture of concept 82 or 83, wherein each of the first and second isolated nucleic acids is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to a sequence selected from the group consisting of SEQ ID NOs: 27-47.
-   89. The mixture of concept 84, comprising at least 3 different isolated nucleic acids each that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to a sequence selected from the group consisting of SEQ ID NOs: 27-47.
-   90. The mixture of concept 84 or 85, wherein the mixture comprises an antibody heavy chain constant region sequence; and optionally comprises SEQ ID NO: 48 and/or 49.
-   91. The mixture of concept 86 or 87, wherein the mixture comprises an antibody kappa chain constant region sequence; and optionally comprises SEQ ID NO: 50 and/or 51.
-   92. The mixture of concept 87 or 88, wherein the mixture comprises an antibody lambda chain constant region sequence; and optionally comprises SEQ ID NO: 52 and/or 53.
-   93. The method of any one of concepts 1 to 36, wherein step (c) is performed by PCR using one or more mixtures according to any one of concepts 74 to 92.
-   94. The method of any one of concepts 48 to 56, wherein the method is performed by PCR using one or more mixtures according to any one of concepts 74 to 92.
-   95. A kit comprising one or more mixtures according to any one of concepts 74 to 92 and an apparatus according to any one of concepts 36 to 39.
-   96. A method of amplifying a repertoire of human variable region sequences, the method comprising
    -   a. Providing a population of cells expressing a repertoire of human variable regions, wherein the cells comprise nucleotide sequences encoding the variable regions;
    -   b. Replicating a plurality of said variable region-encoding nucleotide sequences using PCR and PCR templates; and
    -   c. Isolating, sequencing or identifying one or more of the replicated nucleotide sequences or carrying out steps (d) and (e) of the method of any one of concepts 1 to 36; wherein one or more templates of step (b) comprises a sequence that is at least 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical (or 100% identical) to a sequence selected from the group consisting of SEQ ID NOs: 1-52.
-   97. The method of concept 96, wherein step (b) uses one or more mixtures according to any one of concepts 74 to 92 as PCR template.
-   98. The method of concept 96 or 97, wherein the cells in step (a) are sorted single cells (e.g., sorted into wells of one or more plates).
-   99. The method of concept 96 or 97, further comprising producing a human variable region (e.g., as part of an isolated antibody chain or an isolated antibody for human medicine) using a replicated sequence obtained in step (c) and optionally producing a cell line that expresses the human variable region.
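
The percent-identity thresholds recited in the concepts above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length and counts identity position by position; the specification does not state which alignment tool or identity definition is intended, so the function names and this definition are illustrative assumptions only.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences.

    Assumption: identity is counted per aligned position; other definitions
    (e.g., relative to the shorter sequence after gapped alignment) exist.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a: str, seq_b: str, threshold: float = 90.0) -> bool:
    """True if the two sequences are at least `threshold` percent identical."""
    return percent_identity(seq_a, seq_b) >= threshold
```

For example, two hypothetical 10-mers differing at one position are 90% identical and so satisfy the lowest recited threshold.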

The following optional features are applicable to any configuration, aspect, embodiment or example of the invention described herein.

Optionally, the POI-encoding nucleotide sequence is operably linked to a promoter capable of driving expression of the POI, wherein the promoter comprises a eukaryotic promoter that is regulatable by an activator or inhibitor. In another embodiment, the eukaryotic promoter is operably linked to a prokaryotic operator, and the eukaryotic cell optionally further comprises a prokaryotic repressor protein.

Optionally, each expression cassette comprises a sequence encoding a marker, such as a selectable marker, e.g., a hygromycin resistance gene or encoding a fluorescent protein (e.g., the fluorescent protein is selected from DsRed, GFP, eGFP, CFP, eCFP, and YFP).

Optionally, one or more or all of the expression cassettes comprises first and second POI-encoding nucleotide sequences, e.g., in tandem or as a bicistronic cassette. In an example, the encoded POIs are different (e.g., VH and VL of an antibody); in another example, they are the same. In an example, the cassette comprises 1, 2, 3, 4, 5, 6 or more POI-encoding nucleotide sequences.

In an example, the or each host cell is a CHO (Chinese Hamster Ovary) cell or HEK293 cell.

For example, the protein of interest can be an antibody or fragment thereof, a chimeric antibody or fragment thereof, an ScFv or fragment thereof, an Fc-tagged protein or fragment thereof, a growth factor or a fragment thereof, a cytokine or a fragment thereof, or an extracellular domain of a cell surface receptor or fragment thereof.

Nucleic Acid Constructs

Recombinant expression cassettes (vectors) can comprise synthetic or cDNA-derived DNA fragments encoding a protein of interest, operably linked to a suitable transcriptional and/or translational regulatory element derived from mammalian, viral or insect genes. Such regulatory elements include transcriptional promoters, enhancers, sequences encoding suitable mRNA ribosomal binding sites, and sequences that control the termination of transcription and translation. Mammalian expression cassettes can also comprise non-transcribed elements such as an origin of replication, other 5′ or 3′ flanking non-transcribed sequences, and 5′ or 3′ non-translated sequences such as splice donor and acceptor sites. A selectable marker gene to facilitate recognition of transfectants may also be incorporated.

Transcriptional and translational control sequences in expression cassettes useful for transfecting vertebrate cells may be provided by viral sources. For example, commonly used promoters and enhancers are derived from viruses such as polyoma, adenovirus 2, simian virus 40 (SV40), and human cytomegalovirus (CMV). Viral genomic promoters, control and/or signal sequences may be utilized to drive expression, provided such control sequences are compatible with the host cell chosen. Non-viral cellular promoters can also be used (e.g., the beta-globin and the EF-1 alpha promoters), depending on the cell type in which the recombinant protein is to be expressed.

DNA sequences derived from the SV40 viral genome, for example, the SV40 origin, early and late promoter, enhancer, splice, and polyadenylation sites may be used to provide other genetic elements useful for expression of the heterologous DNA sequence. Early and late promoters are particularly useful because both are obtained easily from the SV40 virus as a fragment that also comprises the SV40 viral origin of replication (Fiers et al., Nature, 1978, 273:113). Smaller or larger SV40 fragments may also be used. Typically, the approximately 250 bp sequence extending from the HindIII site toward the BglI site located in the SV40 origin of replication is included.

Bicistronic expression vectors used for the expression of multiple transcripts have been described previously (Kim S. K. and Wold B. J., Cell, 1985, 42:129) and can be used in combination with one or more POI-encoding sequences.

Host Cells and Transfection

Optionally, eukaryotic host cells are used in the methods of the invention, e.g., they are mammalian host cells, including, for example, CHO cells or mouse cells.

Expressed proteins (POIs) will preferably be secreted into the culture medium, depending on the nucleic acid sequence selected, but may be retained in the cell or deposited in the cell membrane. Various mammalian cell culture systems can be employed to express recombinant proteins. Examples of suitable mammalian host cell lines include the COS-7 lines of monkey kidney cells, described by Gluzman (1981) Cell 23:175, and other cell lines capable of expressing an appropriate vector including, for example, CV-1/EBNA (ATCC CRL 10478), L cells, C127, 3T3, CHO, HeLa and BHK cell lines. Other cell lines developed for specific selection or amplification schemes will also be useful with the methods and compositions provided herein. A preferred cell line is the CHO cell line designated K1. In order to achieve the goal of high volume production of recombinant proteins, the host cell line is optionally pre-adapted to bioreactor medium in the appropriate case.

Several transfection protocols are known in the art, and are reviewed in Kaufman (1988) Meth. Enzymology 185:537. The transfection protocol chosen will depend on the host cell type and the nature of the POI, and can be chosen based upon routine experimentation. The basic requirements of any such protocol are first to introduce DNA encoding the protein of interest into a suitable host cell, and then to identify and isolate host cells which have incorporated the heterologous DNA in a relatively stable, expressible manner.

One commonly used method of introducing heterologous DNA into a cell is calcium phosphate precipitation, for example, as described by Wigler et al. (Proc. Natl. Acad. Sci. USA 77:3567, 1980). DNA introduced into a host cell by this method frequently undergoes rearrangement, making this procedure useful for cotransfection of independent genes.

Polyethylene glycol-induced fusion of bacterial protoplasts with mammalian cells (Schaffner et al., (1980) Proc. Natl. Acad. Sci. USA 77:2163) is another useful method of introducing heterologous DNA. Protoplast fusion protocols frequently yield multiple copies of the plasmid DNA integrated into the mammalian host cell genome, and this technique requires the selection and amplification marker to be on the same nucleic acid as the POI.

Electroporation can also be used to introduce DNA directly into the cytoplasm of a host cell, for example, as described by Potter et al. (Proc. Natl. Acad. Sci. USA 81:7161, 1988) or Shigekawa et al. (BioTechniques, 6:742, 1988). Unlike protoplast fusion, electroporation does not require the selection marker and the POI to be on the same nucleic acid.

More recently, several reagents useful for introducing heterologous DNA into a mammalian cell have been described. These include Lipofectin™ Reagent and Lipofectamine™ Reagent (Gibco BRL, Gaithersburg, Md.). Both of these reagents are commercially available reagents used to form lipid-nucleic acid complexes (or liposomes) which, when applied to cultured cells, facilitate uptake of the nucleic acid into the cells.

A method for amplifying the POI is also desirable for expression of the recombinant protein, and typically involves the use of a selection marker (reviewed in Kaufman supra). Resistance to cytotoxic drugs is the characteristic most frequently used as a selection marker, and can be the result of either a dominant trait (e.g., can be used independent of host cell type) or a recessive trait (e.g., useful in particular host cell types that are deficient in whatever activity is being selected for). Several amplifiable markers are suitable for use in the expression vectors of the invention (e.g., as described in Maniatis, Molecular Biology: A Laboratory Manual, Cold Spring Harbor Laboratory, NY, 1989; pgs 16.9-16.14).

Useful selectable markers for gene amplification in drug-resistant mammalian cells are shown in Table 1 of Kaufman, R. J., supra, and include DHFR-MTX resistance, P-glycoprotein and multiple drug resistance (MDR)-various lipophilic cytotoxic agents (e.g., adriamycin, colchicine, vincristine), and adenosine deaminase (ADA)-Xyl-A or adenosine and 2′-deoxycoformycin.

Other dominant selectable markers include microbially derived antibiotic resistance genes, for example neomycin, kanamycin or hygromycin resistance. However, these selection markers have not been shown to be amplifiable (Kaufman, R. J., supra). Several suitable selection systems exist for mammalian hosts (Maniatis supra, pgs 16.9-16.15). Co-transfection protocols employing two dominant selectable markers have also been described (Okayama and Berg, Mol. Cell Biol 5:1136, 1985).

Useful regulatory elements, described previously or known in the art, can also be included in the nucleic acid constructs used to transfect mammalian cells. The transfection protocol chosen and the elements selected for use therein will depend on the type of host cell used. Those of skill in the art are aware of numerous different protocols and host cells, and can select an appropriate system for expression of a desired protein, based on the requirements of the cell culture system used.

An aspect provides a pharmaceutical composition comprising an isolated POI (e.g., antibody, chain or variable domain) and a diluent, excipient or carrier, optionally wherein the composition is contained in an IV container (e.g., an IV bag) or a container connected to an IV syringe and wherein the POI has been isolated from a host cell of the invention or population of host cells.

An aspect provides the use of the POI of the invention in the manufacture of a medicament for the treatment and/or prophylaxis of a disease or condition in a patient, e.g., a human.

It will be understood that particular embodiments described herein are shown by way of illustration and not as limitations of the invention. The principal features of this invention can be employed in various embodiments without departing from the scope of the invention. Those skilled in the art will recognize, or be able to ascertain using no more than routine study, numerous equivalents to the specific procedures described herein. Such equivalents are considered to be within the scope of this invention and are covered by the claims. All publications and patent applications mentioned in the specification are indicative of the level of skill of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference. The use of the word “a” or “an” when used in conjunction with the term “comprising” in the claims and/or the specification may mean “one,” but it is also consistent with the meaning of “one or more,” “at least one,” and “one or more than one.” The use of the term “or” in the claims is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and “and/or.” Throughout this application, the term “about” is used to indicate that a value includes the inherent variation of error for the device, the method being employed to determine the value, or the variation that exists among the study subjects.

Example 1

The B cell cloning technology of the present invention includes three major steps: isolation of antigen-specific single B cells or ASCs from spleen, lymph node and bone marrow with corresponding cell markers; antibody sequence rescue from single cells and expression cassette amplification; and expression of recombinant antibodies in mammalian cells. The flow chart is shown in FIG. 1B. The details of each step are described below.

Example 1A: Isolation of Antigen-Specific Single Cells

Antigen-specific cells include the memory/GC cells with membrane-bound antibody and the antibody-secreting plasma cells. A panel of cell surface markers was used to define and label mouse memory/GC cells (CD19; IgM; IgD; CD38; CD95) (FIG. 2). Antigen-specific cells were stained using fluorescence-labelled soluble antigens (for example, any small molecule fluorophore which can be detected by the cell sorting system, such as Alexa-488, Alexa-647, Pacific Blue, R-phycoerythrin, fluorescein isothiocyanate, or allophycocyanin optionally conjugated to a cyanine dye, e.g. Cy7) or cell surface antigens in virus-like particles (VLPs). FIG. 2 is an example of labelling the antigen-specific memory/GC cells in OVA-immunized mouse spleen. Over 10,000 OVA-specific memory/GC IgG cells could be sorted from one spleen. Single-cell sorting was performed using a BD Influx flow cytometer equipped with an automatic cell deposition unit. FACS-sorted cells were deposited into 96-well PCR plates with lysis buffer for the next step.

Generally, antigen-specific GC (germinal centre) or memory B cells can be captured by labelled antigen because they dominantly express transmembrane antibodies on the cell surface. On the other hand, plasmablast or plasma cells were thought to be less easily captured by labelled antigen because of their dominant expression of secreted antibody. We next attempted to isolate ASPCs using fluorescently labelled antigen and anti-CD138 to sort the cells from the rest of the cell population using FACS (FIG. 3). As shown in FIG. 2, the majority of the isolated antigen-specific plasma or plasmablast cells, but none of the left-over cells of the same types, proved to be antigen-specific ASCs in the ELISPOT assay. This demonstrated that the cell sorting method using fluorescence-labelled antigen can efficiently capture all the antigen-specific ASCs, probably via residual transmembrane antibodies or secreted antibody temporarily anchored on the cell surface.

The fluorescently-labelled antigen can be replaced with VLPs with recombinant antigen on their surface. The VLPs are generated from CHO cells, HEK cells, MEFs (mouse embryonic fibroblasts) or other mammalian cell lines with co-expression of the recombinant antigen, the retrovirus gag protein, and MA-GFP (gag matrix fragment p15-GFP fusion protein). The gag expression enables VLP budding from cells, and the MA-GFP labels the VLPs for fluorescence detection. Both gag and MA-GFP proteins are associated with the inner surface of the plasma membrane, and recombinant antigen is on the VLP surface. The antigens on the VLPs are presented in native form, directly expressed from recombinant cells without any step of purification or modification. The native form of an antigen should provide all the natural epitopes, which greatly helps selection of neutralizing antibodies. The high density of the antigen on the VLPs increases the signal/noise ratio for detection of cells expressing antigen-specific antibodies on the cell surface and greatly facilitates the sorting step. The recombinant VLPs can be generated with expression of different fluorescent proteins such as MA-CFP or MA-YFP. Using multiplexing of VLPs with different antigens and different fluorescent proteins, cells expressing high affinity binders, cross-reactive binders or homolog-specific binders can be selected. Cells expressing high affinity binders can be selected as those with a relatively high affinity matrix (affinity matrix = the ratio of binding activity to low density antigen VLPs over that to high density antigen VLPs). Cells expressing cross-reactive binders to orthologs or different antigens (for 2-in-1 bi-specific antibody isolation) can be selected as those binding to different types of VLPs at the same time. Cells expressing homolog-specific binders can also be selected as those binding only to the specific antigen but not to its homolog.
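
The affinity matrix defined above is a simple ratio, and a minimal sketch of how sorted cells might be ranked by it follows. The signal values, cell identifiers and function names are hypothetical; the specification does not describe how binding activity is quantified, so fluorescence intensities are assumed here purely for illustration.

```python
from typing import Dict, List, Tuple


def affinity_matrix(low_density_signal: float, high_density_signal: float) -> float:
    """Ratio of binding to low density antigen VLPs over binding to high density antigen VLPs."""
    if high_density_signal <= 0:
        raise ValueError("high density signal must be positive")
    return low_density_signal / high_density_signal


def rank_by_affinity_matrix(signals: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return cell identifiers ordered from highest to lowest ratio.

    `signals` maps a (hypothetical) cell identifier to a pair
    (low density signal, high density signal).
    """
    return sorted(signals, key=lambda cell: affinity_matrix(*signals[cell]), reverse=True)
```

A cell that retains most of its signal on low density antigen VLPs gives a ratio close to 1 and is ranked first under this sketch.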

Example 1B: High-Throughput Recovery of the Antibody Sequence from Single Cells

A rapid, efficient and high-throughput method was developed for generation of antibody from individual B cells without any molecular cloning step. The method allowed us to produce Ig-expression constructs from amplified variable gene segments of heavy chain and light chain from a single cell (FIG. 3). Throughout the whole PCR procedure, heavy chain and light chains from a single cell were amplified in the same well.

The sequences of antibody V regions were recovered by RT-PCR and two rounds of PCR by the following procedure. Single-cell sorted plates stored at −80° C. were thawed on ice and briefly centrifuged before use. Plates were incubated in the thermal cycler at 65° C. for 5 minutes and then held indefinitely at 4° C. A 6 μL mixture of primers, Superscript III, dNTP, RNase inhibitor and buffer was added to each well and mixed by pipetting. Plates were briefly centrifuged and incubated at 50° C. for 60 minutes. Constant region-specific primers for the heavy chain and light chain were used to amplify the variable gene segments from a single cell. The gene-specific reverse primers used for the gamma, kappa and lambda chains were gamma RT1, kappa RT1 and lambda RT3, respectively.

The first round of PCR was performed with forward V gene-specific primers with a human cytomegalovirus (hCMV) promoter fragment at the 5′ end, and reverse constant region-specific primers. Product from the RT-PCR was used as template for the first PCR. The PCR product comprises the variable immunoglobulin region and part of the constant region. Cycling conditions for the first PCR included an initial denaturation step at 98° C. for 30 minutes, followed by 13 touchdown cycles of 98° C. for 10 minutes, 72° C. to 60° C. for 30 minutes and 72° C. for 30 minutes with a drop of 1° C. for each subsequent annealing step; 20 cycles of 98° C. for 10 minutes, 60° C. for 30 minutes and 72° C. for 30 minutes, followed by a final extension at 72° C. for 2 minutes and held at 4° C. indefinitely.
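
The touchdown phase described above can be sketched as a list of per-cycle annealing temperatures. This is an illustrative reading of the stated conditions (13 cycles stepping from 72° C. to 60° C. in 1° C. drops, then 20 cycles at 60° C.); the function names are hypothetical and step durations are deliberately omitted.

```python
def touchdown_annealing_temps(start_c: int = 72, end_c: int = 60, drop_c: int = 1) -> list:
    """Annealing temperature for each touchdown cycle, inclusive of both ends."""
    if drop_c <= 0 or start_c < end_c:
        raise ValueError("expected start_c >= end_c and a positive drop")
    return list(range(start_c, end_c - 1, -drop_c))


def first_round_annealing_profile(fixed_cycles: int = 20, fixed_temp_c: int = 60) -> list:
    """Touchdown cycles followed by the fixed-temperature cycles of the first PCR."""
    return touchdown_annealing_temps() + [fixed_temp_c] * fixed_cycles
```

Stepping from 72° C. down to 60° C. in 1° C. drops yields exactly 13 annealing temperatures, which agrees with the 13 touchdown cycles stated; with the 20 fixed cycles the first round totals 33 cycles.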

In the second round of PCR, a generic forward primer that annealed to the hCMV tag was used with a reverse nested primer for the constant region. 1 μL of the product from the first PCR was used as template for the nested second PCR. Cycling conditions for the second PCR included an initial denaturation step at 98° C. for 30 minutes, followed by 20 cycles of 98° C. for 10 minutes, 68° C. for 30 minutes, and 72° C. for 30 minutes; a final extension at 72° C. for 2 minutes and held at 4° C. indefinitely. One third of the second round PCR product was run on a 1% agarose gel to check the recovery rate of the antibody sequence from single cells for the RT and two rounds of PCR steps. Since the primers for heavy and light chain were mixed into the same wells, the PCR products contain two bands representative of the heavy chain VDJ region and the light chain VJ region. The expected sizes of PCR products are ˜700 bp for the kappa and lambda light chains, and ˜500 bp for the gamma heavy chain (FIG. 4a).

For antibody expression in the mammalian cells, the amplified products were then bridged with a linear Ig-cassette with 5′ PB LTR-CMV promoter and constant region-polyA signal-3′ PB LTR (FIG. 3). The Ig-cassette contains all essential elements for expression of the antibody, including the CMV promoter, the immunoglobulin chain constant region and the poly (A) signal. Additionally, the cassette has long overlapping regions of CMV and constant region homology on its ends. 2 μL of the product from the second round PCR was used as template for the bridge PCR. Cycling conditions for the bridge PCR were an initial denaturation step at 98° C. for 30 minutes, followed by 5 cycles of 98° C. for 10 minutes, 68° C. for 30 minutes, and 72° C. for 2 minutes; and 25 cycles of 98° C. for 10 minutes, 60° C. for 30 minutes, and 72° C. for 2 minutes; followed by a final extension at 72° C. for 2 minutes and held at 4° C. indefinitely. One third of the bridge PCR product was run on a 1% agarose gel to check the recovery rate of the bridge PCR. The expected sizes of PCR products are ˜2600 bp for the kappa and lambda light chains, and ˜3100 bp for the gamma heavy chain (FIG. 4b).

The bridge step brings all the expression elements and PB LTRs together to form the PB transposon with heavy chain and light chain expression genes. No matter which isotype the mouse antibodies have, mouse IgG1, IgG2a, IgG2b, IgG3 or human IgG1, IgG2, IgG3, IgG4 or any variants of constant region can be applied in the bridge step to reformat the Fc. The method applied in this technology does not require any purification step and can be extensively automated. The overall recovery rate for the B-cell technology (BCT) through cell sorting and single cell PCR is about 38-71% for different cell populations (Table 3).

Example 1C: Sequence Analysis by Clusters

The second round PCR products were sent for sequencing. The nucleotide sequences were determined using an Applied Biosystems 373 DNA sequencer. The sequences were analysed by the Kymab seq-utils program (Lee E. C. et al., Nature Biotechnol., 2014, 32:356-363). The program predicts the germline sequence and the hypermutation of the analysed IG sequence. The variable immunoglobulin region comprises a VDJ region of an immunoglobulin nucleotide sequence for heavy chain genes and a VJ region of an immunoglobulin nucleotide sequence for Igκ and Igλ. A clonal family is generally defined by the use of related immunoglobulin heavy chain and/or light chain V(D)J sequences by 2 or more samples. Related immunoglobulin heavy chain V(D)J sequences can be identified by their shared usage of V(D)J gene segments encoded in the genome (FIG. 5).
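
The clonal family definition above (shared V(D)J gene segment usage by 2 or more samples) can be sketched as a simple grouping step. This is not the seq-utils program, whose interface is not described here; the record layout and function name are assumptions made for illustration, and the gene segment names in any input are merely example labels.

```python
from collections import defaultdict


def clonal_families(records):
    """Group samples by shared (V, D, J) gene segment usage.

    `records` is an iterable of (sample_id, v_gene, d_gene, j_gene) tuples,
    a hypothetical layout. Only groups with 2 or more samples are returned,
    following the definition of a clonal family given in the text.
    """
    groups = defaultdict(list)
    for sample_id, v_gene, d_gene, j_gene in records:
        groups[(v_gene, d_gene, j_gene)].append(sample_id)
    return {usage: ids for usage, ids in groups.items() if len(ids) >= 2}
```

Subfamilies within each group could then be resolved by comparing shared mutations, which this sketch does not attempt.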

Within a clonal family, there are generally subfamilies that vary based on shared mutations within their V(D)J segments, which can arise during B-cell gene recombination and somatic hypermutation. Clones with different V(D)J segment usage usually exhibit different binding characteristics. Also, clones with the same V(D)J segment usage but different mutations exhibit different binding characteristics. B cells undergo somatic hypermutation, where random changes in the nucleotide sequences of the antibody genes are made, and B cells whose antibodies have a higher affinity are selected (FIG. 6). If low affinity clones from the same lineage have neutralization function, the potency usually increases in clones that have acquired more mutations and hence higher affinity.

Example 1D: Generation of Monoclonal Antibodies from Single Cells

The final PCR step amplified the linear expression cassette encoding heavy chain and light chain. The amplified cassettes for heavy chain and light chain, and the PB transposase (PBase) expression vector, were co-transfected into a mammalian cell line without purification and cloning. Supernatants with transient or stable expression of antibody were then collected at the corresponding time points. Transfection of a conventional expression vector is likely to cause concatemer integration into the genome, and the integrated gene is subject to being silenced. PB transposon-mediated expression provides a major advantage for high and stable expression of transfected genes because multiple copies (10-100) of PB transposons can be transposed and integrated into the genome within transcriptionally active regions, allowing high level expression of antibody (FIG. 7).

Transfections of bridge PCR products and PBase expression vectors for transient expression were done using LIPOFECTAMINE™ 2000 following the manufacturer's protocol. Transfections were carried out in 96-well deep well plates. In brief, HEK293 cells were cultured in DMEM+10% ultralow IgG FBS (Invitrogen) to prevent bovine IgG from competing with secreted human IgG at the downstream protein A purification step. For 96-well plate transfections, each well was seeded the day before with 5×10⁵ cells in 500 μL of medium, and allowed to grow to 1×10⁶ cells the next day. 25 μL out of the 30 μL of the bridge PCR products were incubated in OPTIMEM™ media with 100 ng of PBase vector for a final volume of 70 μL, and LIPOFECTAMINE™ 2000 was also separately incubated with 70 μL of OPTIMEM™ media. Both incubations were for 10 minutes. LIPOFECTAMINE™ 2000 and the PCR products were then mixed by gentle pipetting and incubated for 15 minutes before being added to the HEK293 cells and gently mixed. Culture supernatants were collected on day 8 after transfection for subsequent screening. The IgG concentration in supernatants containing antibodies of interest was determined by IgG ELISA (FIG. 8). The concentration of the expressed antibodies is comparable to what is normally obtained from the hybridoma technology, which is enough for most of the downstream screens for antibody binding ability or functional assays. The overall hit rate for B cell technology from cell sorting through to IgG identification is 36%-71% depending on the cell population (Table 3).

Example 1E: Antibody-Binding Screening Assay Using LI-COR Odyssey NIR Scanning

The expressed antibodies from the HEK293 cell transfection of the B-cell technology (BCT) were first screened for their ability to bind to the antigen of interest using LI-COR Odyssey NIR scanning, and then positive clones were screened for their apparent affinity by Surface Plasmon Resonance (PROTEON™ XPR36, BioRad) (see below).

B cells producing antigen-specific antibodies were identified by fluorescent screening. Each well of clear 384-well flat-bottom plates was seeded with 1×10⁴ adherent CHO cells stably transfected with a gene encoding a human transmembrane antigen in 80 μL of F12 medium containing 10% (v/v) FBS (1.25×10⁵ cells/mL) using a Multidrop instrument. Cells were incubated overnight at 37° C. in a CO₂ incubator. The next day the culture media was removed by aspiration and 45 μL of LI-COR IRDYE™ 800CW anti-Mouse antibody was added at 500 ng/mL + 5 mM DRAQ5 (LI-COR) diluted 1:25,000 in FACS Buffer (PBS+1% BSA+0.1% NaN₃). 5 μL of BCT supernatant, 5 μL of control antibody (2 μg/mL) in HEK293 culture medium or 5 μL of mouse IgG1 control antibody (Sigma, 2 μg/mL) in HEK293 culture medium was added using a FluidX liquid handler. Plates were incubated for 1 hr at 4° C. and the culture media aspirated. The reaction was stopped and the cells fixed by the addition of 25 μL of 4% paraformaldehyde per well and incubation for 15 minutes at RT. Plates were washed twice with 100 μL of PBS and the wash solution was removed by blotting on paper towels. Plates were scanned using a Li-Cor Odyssey Classic instrument. The overall hit rate for BCT from cell sorting through to antigen-specific identification is 25%-61% depending on the cell population (Table 3).
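
The seeding figures above are mutually consistent, as the following arithmetic sketch shows: 1×10⁴ cells in 80 μL corresponds to 1.25×10⁵ cells/mL. The function names are hypothetical, and the per-plate totals ignore any dead volume of the dispenser, which the text does not specify.

```python
def cells_per_ml(cells_per_well: float, volume_ul: float) -> float:
    """Cell suspension density needed to deliver `cells_per_well` in `volume_ul`."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return cells_per_well * 1000.0 / volume_ul


def plate_requirements(cells_per_well: float, volume_ul: float, wells: int = 384):
    """Total cells and total suspension volume (mL) for one plate, excluding dead volume."""
    total_cells = cells_per_well * wells
    total_volume_ml = volume_ul * wells / 1000.0
    return total_cells, total_volume_ml
```

Under these assumptions one full 384-well plate needs 3.84×10⁶ cells in 30.72 mL of suspension.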

TABLE 3 Recovery rate of each step of B-cell technology

| BCT procedures | Spleen memory/GC cells | LN memory/GC cells (RIMMS) | Spleen plasmablast | Bone marrow plasma cells |
|---|---|---|---|---|
| Single cell PCR | 38% | 66% | 71% | 50% |
| Expression & screening IgG | 36% | 66% | 71% | 50% |
| Expression & screening Ag-specific binders | 25% | 59% | 61% | 31% |

Example 1F: Affinity Measurements by SPR Using Antibody Capture Method

Positive clones expressing antigen-specific antibodies were then screened for their apparent affinity by Surface Plasmon Resonance. Anti-mouse IgG (GE Healthcare/Biacore) was coupled to the GLM chip by primary amine coupling. The GLM chip (BioRad) was activated using NHS/EDAC and the anti-mouse IgG coupled to this activated surface and then blocked using 1 M ethanolamine. Immobilization was carried out in either HBS-EP (Teknova) or HBS-N (GE Healthcare/Biacore) at room temperature or 37° C., respectively. The anti-mouse IgG surface on the GLM chip was used to directly capture antibodies of interest. For kinetic analysis, 5 concentrations of analyte were used (256 nM, 64 nM, 16 nM, 4 nM and 1 nM). For data analysis, the binding sensorgrams were referenced using the internal “interspot” referencing unique to the ProteOn XPR36 and were then double referenced using the buffer injection sensorgram. Finally, the data were analyzed using the 1:1 model inherent to the ProteOn XPR36 analysis software.
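
The five analyte concentrations above form a 4-fold serial dilution from 256 nM, which can be sketched as follows. The second and third functions use the standard 1:1 (Langmuir) relations K_D = k_d/k_a and equilibrium occupancy C/(C + K_D); these are textbook relations assumed here for illustration, not a description of the ProteOn analysis software, and all function names are hypothetical.

```python
def dilution_series(top_nm: float, fold: float, points: int) -> list:
    """Concentrations (nM) of a serial dilution starting at `top_nm`."""
    if fold <= 1 or points < 1:
        raise ValueError("expected fold > 1 and at least one point")
    return [top_nm / fold**i for i in range(points)]


def dissociation_constant(ka_per_m_s: float, kd_per_s: float) -> float:
    """Equilibrium dissociation constant K_D (M) from 1:1 rate constants."""
    if ka_per_m_s <= 0:
        raise ValueError("association rate constant must be positive")
    return kd_per_s / ka_per_m_s


def fractional_occupancy(conc: float, kd: float) -> float:
    """Fraction of captured antibody bound at equilibrium under a 1:1 model.

    `conc` and `kd` must be in the same units.
    """
    return conc / (conc + kd)
```

With a top concentration of 256 nM, a 4-fold step and 5 points, the series reproduces the concentrations listed in the text.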

FIG. 9 is an example of SPR data of the antibodies from the ovalbumin-immunized Kymouse® using the single B cell technology of the invention. Around two thirds of antibodies tested showed evidence of binding to the antigen, with a diverse range of distinct binding affinities and kinetics. The variety of binding characteristics reveals that the cell sorting procedure is effective in capturing antibodies of diverse quality. Despite the small scale of this experiment (two animals), many high affinity antibodies in the low-nM to low-pM range were isolated, verifying the efficiency of affinity maturation.

FIG. 10 shows the apparent affinity of antibodies against two different Kymab target antigens. A range of binders (open) as well as functional neutralisers (filled) were detected, with the highest affinity detected in the picomolar range. This validates the single B cell cloning technology of the invention as a powerful tool for the identification and retrieval of high affinity and functionally competent antibodies.

As used in this specification and claim(s), the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), “including” (and any form of including, such as “includes” and “include”) or “containing” (and any form of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, unrecited elements or method steps.

The term “or combinations thereof” as used herein refers to all permutations and combinations of the listed items preceding the term. For example, “A, B, C, or combinations thereof” is intended to include at least one of: A, B, C, AB, AC, BC, or ABC, and if order is important in a particular context, also BA, CA, CB, CBA, BCA, ACB, BAC, or CAB. Continuing with this example, expressly included are combinations that contain repeats of one or more item or term, such as BB, AAA, AAB, BBC, AAABCCCC, CBBAAA, CABABB, and so forth. The skilled artisan will understand that typically there is no limit on the number of items or terms in any combination, unless otherwise apparent from the context.
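By way of illustration only, the order-independent and order-dependent groupings listed in the example above can be enumerated with a short Python sketch using the standard itertools module. Combinations containing repeated items (such as BB) are unbounded in number and are not enumerated here.

```python
from itertools import combinations, permutations

items = ["A", "B", "C"]

# Order-independent selections of one or more items: A, B, C, AB, AC, BC, ABC.
unordered = ["".join(c)
             for r in range(1, len(items) + 1)
             for c in combinations(items, r)]

# Order-dependent arrangements of one or more items (no repeats).
ordered = ["".join(p)
           for r in range(1, len(items) + 1)
           for p in permutations(items, r)]

# Arrangements that only count as distinct when order is important.
extra_when_order_matters = sorted(set(ordered) - set(unordered))

print(unordered)
print(extra_when_order_matters)
```

The first list reproduces the seven order-independent groupings and the second the eight additional orderings (BA, CA, CB, CBA, BCA, ACB, BAC and CAB).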

Any part of this disclosure may be read in combination with any other part of the disclosure, unless otherwise apparent from the context.

All of the compositions and/or methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the compositions and/or methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.

TABLE 1 Nucleotide Sequences/Nucleic Acids:

V_(H) Oligos
SEQ ID NO: 1    TCTAGAGAAAACCCTGTGAGCACAGCTC
SEQ ID NO: 2    GAGAATCCCCTGAGAGCTCCGTTC
SEQ ID NO: 3    TCAGAAGCCCCCAGAGCACAACGC
SEQ ID NO: 4    TGGGAGAATCCCCTAGATCACAGCTC
SEQ ID NO: 5    ACAGAAGCCCCCAGAGCGCAGCAC
SEQ ID NO: 6    CCCACCATGGACACACTTTGCTCC
SEQ ID NO: 7    TGGACTCCAAGGCCTTTCCACTTGG
SEQ ID NO: 8    TGGACCTCCTGCACAAGAACATGAAACAC
SEQ ID NO: 9    GCAGTCACCAGAGCTCCAGACAATGTC
SEQ ID NO: 10   AAGAAGAAGCCCCTAGACCACAGCTCCAC
SEQ ID NO: 11   TGAGATTCCCAGGTGTTTCCATTCAG
SEQ ID NO: 12   AGAGCCCCAGCCCCAGAATTCCCAGGAG
SEQ ID NO: 13   TTCAGTGATCAGGACTGAACACACA
SEQ ID NO: 14   CCCCAGCCTTGGGATTCCCAAGTGTTTTC
SEQ ID NO: 15   TGAGATTCCCACGTGTTTCCATTCAG
SEQ ID NO: 16   ACTTGGTGATCAGCACGGAGCACCGA
SEQ ID NO: 17   CTGGGATTTTCAGGTGTTTTCATTTGG

V_(K) Oligos
SEQ ID NO: 18   GGAGTCAGACCCAGTCAGGACACAGC
SEQ ID NO: 19   GGAGTCAGACCCACTCAGGACACAGC
SEQ ID NO: 20   GGAATCAGTCCCACTCAGGACACAGC
SEQ ID NO: 21   GGAGTCAGTCTCAGTCAGGACACAGC
SEQ ID NO: 22   ATCAGGACTCCTCAGTTCACCTTCTCAC
SEQ ID NO: 23   ATTAGGACTCCTCAGGTCACCTTCTCAC
SEQ ID NO: 24   GAGGAACTGCTCAGTTAGGACCCAGA
SEQ ID NO: 25   GCTACAACAGGCAGGCAGGGGCAGC
SEQ ID NO: 26   GACTACCACCTGCAGGTCAGGGCCAAG

V_(L) Oligos
SEQ ID NO: 27   atggcctggtctcctctcctc
SEQ ID NO: 28   atggccggcttccctctcctc
SEQ ID NO: 29   atgccctgggctctgctcctc
SEQ ID NO: 30   atgccctgggtcatgctcctc
SEQ ID NO: 31   atggcctgggctctgctgctc
SEQ ID NO: 32   atggcatggatccctctcttc
SEQ ID NO: 33   atggcctggacccctctcctg
SEQ ID NO: 34   atggcctggacccctctcctc
SEQ ID NO: 35   atggcctggacccctctctgg
SEQ ID NO: 36   atggcctggaccgttctcctc
SEQ ID NO: 37   atggcatgggccacactcctg
SEQ ID NO: 38   atggcctggatccctctactt
SEQ ID NO: 39   atggcctggatccctctcctg
SEQ ID NO: 40   atggcctggaccgctctcctt
SEQ ID NO: 41   atggcctgggtctccttctac
SEQ ID NO: 42   atggcctggaccccactcctc
SEQ ID NO: 43   atggcttggaccccactcctc
SEQ ID NO: 44   atggcctggactcctctcctc
SEQ ID NO: 45   atggcctggactcctctcttt
SEQ ID NO: 46   atggcctggatgatgcttctc
SEQ ID NO: 47   atggcctgggctcctctgctc

C Region Oligos
                Oligo    Sequence (5′ to 3′)
SEQ ID NO: 48   C_(H)1   gctcttgcggTAGCCCTTGACCAGGCATCC
SEQ ID NO: 49   C_(H)2   CAGATCCAGGGGCCAGTGGATAGAC
SEQ ID NO: 50   C_(K)1   gtttctgatcgaaCTAACACTCATTCCTGTTGAAG
SEQ ID NO: 51   C_(K)2   GACAATGGGTGAAGTTGATGTCTTGTGAG
SEQ ID NO: 52   C_(L)1   cgacaaccactacctCTATGAACATTCTGTAGGGGC
SEQ ID NO: 53   C_(L)2   CTTCTCCACGGTGCTCCCTTCATGC
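Several passages of this disclosure refer to a primer comprising at least 15 (or 20) contiguous nucleotides of a listed sequence. One literal way of checking such a condition for a given oligonucleotide is sketched below in Python. The helper name and the candidate primer are hypothetical and are used for illustration only; the sketch performs a plain, case-insensitive string comparison and is not intended to define how any claim term is to be construed.

```python
def shares_contiguous_run(primer: str, reference: str, min_len: int = 15) -> bool:
    """Return True if the primer contains a stretch of at least min_len
    consecutive nucleotides taken from the reference (case-insensitive)."""
    primer, reference = primer.upper(), reference.upper()
    if len(reference) < min_len:
        return False
    # Slide a window of min_len nucleotides along the reference sequence.
    return any(reference[i:i + min_len] in primer
               for i in range(len(reference) - min_len + 1))


SEQ_ID_NO_1 = "TCTAGAGAAAACCCTGTGAGCACAGCTC"  # SEQ ID NO: 1 from TABLE 1

# Hypothetical candidate primer: a 19-nucleotide internal stretch of SEQ ID NO: 1.
candidate = "GAAAACCCTGTGAGCACAG"

print(shares_contiguous_run(candidate, SEQ_ID_NO_1))          # 15-nt threshold
print(shares_contiguous_run(candidate, SEQ_ID_NO_1, 20))      # 20-nt threshold
```

A percentage-identity comparison would additionally require an alignment method to be chosen, which is outside the scope of this sketch.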

TABLE 2

X               Y (i.e., sequence X used to copy or modify gene segment Y)

V_(H) Oligos
SEQ ID NO: 1    IGHV1-8 (e.g., IGHV1-8*01)
SEQ ID NO: 2    IGHV1-2 (e.g., IGHV1-2*04)
SEQ ID NO: 3    IGHV1-3*01 (e.g., IGHV1-3*01)
SEQ ID NO: 4    IGHV1-18 (e.g., IGHV1-18*01)
SEQ ID NO: 5    IGHV1-24 (e.g., IGHV1-24*01)
SEQ ID NO: 6    IGHV2-5 (e.g., IGHV2-5*10) and/or IGHV2-26 (e.g., IGHV2-26*01)
SEQ ID NO: 7    IGHV3-7 (e.g., IGHV3-7*01)
SEQ ID NO: 8    IGHV4-4 (e.g., IGHV4-4*02)
SEQ ID NO: 9    IGHV6-1 (e.g., IGHV6-1*01)
SEQ ID NO: 10   IGHV7-4-1 (e.g., IGHV7-4-1*01)
SEQ ID NO: 11   IGHV3-9 (e.g., IGHV3-9*01)
SEQ ID NO: 12   IGHV3-11 (e.g., IGHV3-11*01)
SEQ ID NO: 13   IGHV3-13 (e.g., IGHV3-13*01)
SEQ ID NO: 14   IGHV3-15 (e.g., IGHV3-15*01)
SEQ ID NO: 15   IGHV3-20 (e.g., IGHV3-20*01)
SEQ ID NO: 16   IGHV3-21 (e.g., IGHV3-21*01)
SEQ ID NO: 17   IGHV3-23 (e.g., IGHV3-23*01)

V_(K) Oligos
SEQ ID NO: 18   One, more or all of IGKV1-5, 1-12, 1-8, 1D-8, 1D-43, 1D-16, 1D-9
SEQ ID NO: 19   One, more or all of IGKV1-6, 1-13, 1D-12, 1D-13
SEQ ID NO: 20   IGKV1-17 and/or 1D-17
SEQ ID NO: 21   One, more or all of IGKV1-27, 1-33, 1D-39
SEQ ID NO: 22   One, more or all of IGKV2-28, 2-30, 2D-40
SEQ ID NO: 23   IGKV2-24
SEQ ID NO: 24   IGKV3 Family
SEQ ID NO: 25   IGKV4-1
SEQ ID NO: 26   IGKV5-2

V_(L) Oligos
SEQ ID NO: 27   IGLV1-40
SEQ ID NO: 28   IGLV1-47
SEQ ID NO: 29   IGLV10-54
SEQ ID NO: 30   IGLV2-23
SEQ ID NO: 31   IGLV3-1
SEQ ID NO: 32   IGLV3-10
SEQ ID NO: 33   IGLV3-12
SEQ ID NO: 34   IGLV3-19
SEQ ID NO: 35   IGLV3-21
SEQ ID NO: 36   IGLV3-22
SEQ ID NO: 37   IGLV3-25
SEQ ID NO: 38   IGLV3-16
SEQ ID NO: 39   IGLV3-9
SEQ ID NO: 40   IGLV4-3
SEQ ID NO: 41   IGLV3-2
SEQ ID NO: 42   IGLV5-45
SEQ ID NO: 43   IGLV7-43
SEQ ID NO: 44   IGLV9-49
SEQ ID NO: 45   IGLV1-40
SEQ ID NO: 46   IGLV1-47
SEQ ID NO: 47   IGLV10-54

What is claimed herein is:
1. A kit comprising one or more mixtures selected from a) two or more PCR primers, each primer comprising at least 15 contiguous nucleotides of a human antibody variable gene segment UTR sequence for hybridising to the 5′UTR sequence of an antibody variable gene segment selected from: IGHV1-8; IGHV1-2; IGHV1-3*01; IGHV1-18; IGHV1-24; IGHV2-5 and/or IGHV2-26; IGHV3-7; IGHV4-4; IGHV6-1; IGHV7-4-1; IGHV3-9; IGHV3-11; IGHV3-13; IGHV3-15; IGHV3-20; IGHV3-21; IGHV3-23; one, more or all of IGKV1-5, 1-12, 1-8, 1D-8, 1D-43, 1D-16, 1D-9; one, more or all of IGKV1-6, 1-13, 1D-12, 1D-13; IGKV1-17 and/or 1D-17; one, more or all of IGKV1-27, 1-33, 1D-39; one, more or all of IGKV2-28, 2-30, 2D-40; IGKV2-24; IGKV3 Family; IGKV4-1; IGKV5-2; IGLV1-40; IGLV1-47; IGLV10-54; IGLV2-23; IGLV3-1; IGLV3-10; IGLV3-12; IGLV3-19; IGLV3-21; IGLV3-22; IGLV3-25; IGLV3-16; IGLV3-9; IGLV4-3; IGLV3-2; IGLV5-45; IGLV7-43; IGLV9-49; IGLV1-40; IGLV1-47 and IGLV10-54, for performing PCR to copy the gene segment; b) a PCR primer mixture comprising at least a first isolated PCR primer and a second isolated PCR primer, wherein the PCR primers are different and each comprises a sequence that is at least 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 1-47; and c) a PCR primer mixture comprising at least a first isolated PCR primer and a second isolated PCR primer, wherein the first PCR primer is capable of hybridising to a human antibody variable region gene segment 5′UTR sequence of a gene comprised by a target nucleic acid; wherein the variable region gene segment encodes a human V region; and the second PCR primer is capable of hybridising to a second sequence; wherein the second sequence is comprised by the target nucleic acid and is 3′ to the UTR sequence; wherein the first isolated PCR primer comprises a sequence that is at least 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 1-47; and wherein the second isolated PCR primer comprises an antibody constant region sequence.
2. The kit according to claim 1, wherein the kit comprises a. a mixture of two or more PCR primers, wherein all of the PCR primers copy VH gene segments and comprise at least 15 contiguous nucleotides of SEQ ID NOs: 1-17; or b. a first group of PCR primers comprising the first and second isolated PCR primers, and wherein all of the PCR primers in the first group copy VH gene segments and the PCR primer sequences comprise a sequence which is at least 90% identical to the sequences of SEQ ID NOs: 1-17; or c. a first group of PCR primers comprising the first isolated PCR primer, and wherein all of the PCR primers in the first group copy VH gene segments and the PCR primer sequences comprise a sequence which is at least 90% identical to the sequences of SEQ ID NOs: 1-17.
3. The kit according to claim 1, wherein the kit comprises a. a mixture of two or more PCR primers, wherein all of the PCR primers copy Vκ gene segments and comprise at least 15 contiguous nucleotides of SEQ ID NOs: 18-26; or b. a first group of PCR primers comprising the first and second isolated PCR primers, and wherein all of the PCR primers in the first group copy Vκ gene segments and the PCR primer sequences comprise a sequence which is at least 90% identical to the sequences of SEQ ID NOs: 18-26; or c. a first group of PCR primers comprising the first isolated PCR primer, and wherein all of the PCR primers in the first group copy Vκ gene segments and the PCR primer sequences comprise a sequence which is at least 90% identical to the sequences of SEQ ID NOs: 18-26.
4. The kit according to claim 1, wherein the kit comprises a. a mixture of two or more PCR primers, wherein all of the PCR primers copy Vλ gene segments and comprise at least 15 contiguous nucleotides of SEQ ID NOs: 27-47; or b. a first group of PCR primers comprising the first and second isolated PCR primers, and wherein all of the PCR primers in the first group copy Vλ gene segments and the PCR primer sequences comprise a sequence which is at least 90% identical to the sequences of SEQ ID NOs: 27-47; or c. a first group of PCR primers comprising the first isolated PCR primer, and wherein all of the PCR primers in the first group copy Vλ gene segments and the PCR primer sequences comprise a sequence which is at least 90% identical to the sequences of SEQ ID NOs: 27-47.
5. The kit according to claim 1, wherein the kit comprises a. a mixture of two or more PCR primers, wherein all of the PCR primers copy VH gene segments and Vκ gene segments and comprise at least 15 contiguous nucleotides of SEQ ID NOs: 1-26; or b. a first group of PCR primers comprising the first and second isolated PCR primers, and wherein all of the PCR primers in the first group copy VH gene segments and Vκ gene segments and the PCR primer sequences comprise a sequence which is at least 90% identical to the sequences of SEQ ID NOs: 1-26; or c. a first group of PCR primers comprising the first isolated PCR primer, and wherein all of the PCR primers in the first group copy VH gene segments and Vκ gene segments and the PCR primer sequences comprise a sequence which is at least 90% identical to the sequences of SEQ ID NOs: 1-26.
6. The kit according to claim 1, wherein the kit comprises a. a mixture of at least 3 PCR primers which copy VH gene segments and 3 PCR primers which copy Vκ gene segments, wherein each PCR primer comprises at least 15 contiguous nucleotides of a sequence selected from SEQ ID NOs: 1-26; or b. wherein the mixture comprises at least 3 PCR primers which copy VH gene segments and at least 3 PCR primers which copy Vκ gene segments and wherein each PCR primer comprises a sequence which is at least 90% identical to a sequence selected from SEQ ID NOs: 1-26; or c. wherein the mixture comprises at least 3 PCR primers which copy VH gene segments and at least 3 PCR primers which copy Vκ gene segments and wherein each PCR primer comprises a sequence which is at least 90% identical to a sequence selected from SEQ ID NOs: 1-26.
7. The kit according to claim 1, wherein the kit comprises a. a mixture of two or more PCR primers, wherein all of the PCR primers copy VH gene segments and Vκ gene segments and comprise at least 15 contiguous nucleotides of each of SEQ ID NOs: 1-26; or b. a first group of at least 26 different PCR primers comprising the first and second isolated PCR primers, wherein all of the PCR primers in the first group copy VH gene segments and Vκ gene segments and wherein each of the 26 PCR primers consists of a sequence which is at least 90% identical to a sequence selected from SEQ ID NOs: 1-26, such that a sequence at least 90% identical to each of the sequences of SEQ ID NOs: 1-26 is present in the first group; or c. a first group of at least 26 different PCR primers comprising the first isolated PCR primer, wherein all of the PCR primers in the first group copy VH gene segments and Vκ gene segments and wherein each of the 26 PCR primers consists of a sequence which is at least 90% identical to a sequence selected from SEQ ID NOs: 1-26, such that a sequence at least 90% identical to each of the sequences of SEQ ID NOs: 1-26 is present in the first group.
8. The kit according to claim 1, wherein the kit comprises a. a mixture of two or more PCR primers, wherein all of the PCR primers copy VH gene segments and Vκ gene segments and comprise at least 20 contiguous nucleotides of each of SEQ ID NOs: 1-26; or b. a first group of at least 26 different PCR primers comprising the first and second isolated PCR primers, wherein all of the PCR primers in the first group copy VH gene segments and Vκ gene segments and wherein each of the 26 PCR primers consists of a sequence which is at least 95% or 100% identical to a sequence selected from SEQ ID NOs: 1-26, such that a sequence at least 95% or 100% identical to each of the sequences of SEQ ID NOs: 1-26 is present in the first group; or c. a first group of at least 26 different PCR primers comprising the first isolated PCR primer, wherein all of the PCR primers in the first group copy VH gene segments and Vκ gene segments and wherein each of the 26 PCR primers consists of a sequence which is at least 95% or 100% identical to a sequence selected from SEQ ID NOs: 1-26, such that a sequence at least 95% or 100% identical to each of the sequences of SEQ ID NOs: 1-26 is present in the first group.
9. The kit according to claim 1, wherein the kit comprises a. a mixture of two or more PCR primers, wherein all of the PCR primers copy VH gene segments and Vλ gene segments and comprise at least 15 contiguous nucleotides of SEQ ID NOs: 1-17 and 27-47; or b. a first group of PCR primers comprising the first and second isolated PCR primers, and wherein all of the PCR primers in the first group copy VH gene segments and Vλ gene segments and the PCR primer sequences comprise a sequence which is at least 90% identical to the sequences of SEQ ID NOs: 1-17 and 27-47; or c. a first group of PCR primers comprising the first isolated PCR primer, and wherein all of the PCR primers in the first group copy VH gene segments and Vλ gene segments and the PCR primer sequences comprise a sequence which is at least 90% identical to the sequences of SEQ ID NOs: 1-17 and 27-47.
10. The kit according to claim 1, wherein the kit comprises a. a mixture of at least 3 PCR primers which copy VH gene segments and at least 3 PCR primers which copy Vλ gene segments and each PCR primer comprises at least 15 contiguous nucleotides of a sequence selected from SEQ ID NOs: 1-17 and 27-47; or b. wherein the mixture comprises at least 3 PCR primers which copy VH gene segments and at least 3 PCR primers which copy Vλ gene segments and wherein each PCR primer sequence comprises a sequence which is at least 90% identical to a sequence selected from SEQ ID NOs: 1-17 and 27-47; or c. wherein the mixture comprises at least 3 PCR primers which copy VH gene segments and at least 3 PCR primers which copy Vλ gene segments and wherein each PCR primer comprises a sequence which is at least 90% identical to a sequence selected from SEQ ID NOs: 1-17 and 27-47.
11. The kit according to claim 1, wherein the kit comprises a. a mixture of two or more PCR primers, wherein all of the PCR primers copy VH gene segments and Vλ gene segments and comprise at least 15 contiguous nucleotides of each of SEQ ID NOs: 1-17 and 27-47; or b. a first group of at least 38 different PCR primers comprising the first and second isolated PCR primers, wherein all of the PCR primers in the first group copy VH gene segments and Vλ gene segments and wherein each of the 38 different PCR primers consists of a sequence which is at least 90% identical to a sequence selected from SEQ ID NOs: 1-17 and 27-47 such that a sequence at least 90% identical to each of the sequences of SEQ ID NOs: 1-17 and 27-47 is present in the first group; or c. a first group of at least 38 different PCR primers comprising the first isolated PCR primer, wherein all of the PCR primers in the first group copy VH gene segments and Vλ gene segments and wherein each of the 38 different PCR primers consists of a sequence which is at least 90% identical to a sequence selected from SEQ ID NOs: 1-17 and 27-47 such that a sequence at least 90% identical to each of the sequences of SEQ ID NOs: 1-17 and 27-47 is present in the first group.
12. The kit according to claim 1, wherein the kit comprises a. a mixture of two or more PCR primers, wherein all of the PCR primers copy VH gene segments and Vλ gene segments and comprise at least 20 contiguous nucleotides of each of SEQ ID NOs: 1-17 and 27-47; or b. a first group of at least 38 different PCR primers comprising the first and second isolated PCR primers, wherein all of the PCR primers in the first group copy VH gene segments and Vλ gene segments and wherein each of the 38 different PCR primers consists of a sequence which is at least 95% or 100% identical to a sequence selected from SEQ ID NOs: 1-17 and 27-47 such that a sequence at least 95% or 100% identical to each of the sequences of SEQ ID NOs: 1-17 and 27-47 is present in the first group; or c. a first group of at least 38 different PCR primers comprising the first isolated PCR primer, wherein all of the PCR primers in the first group copy VH gene segments and Vλ gene segments and wherein each of the 38 different PCR primers consists of a sequence which is at least 95% or 100% identical to a sequence selected from SEQ ID NOs: 1-17 and 27-47 such that a sequence at least 95% or 100% identical to each of the sequences of SEQ ID NOs: 1-17 and 27-47 is present in the first group.
13. The kit according to claim 1, wherein the kit comprises a. a mixture of two or more PCR primers, wherein all of the PCR primers copy VH gene segments, Vκ gene segments and Vλ gene segments and comprise at least 15 contiguous nucleotides of SEQ ID NOs: 1-47; or b. a first group of PCR primers comprising the first and second isolated PCR primers, and wherein all of the PCR primers in the first group copy VH gene segments, Vκ gene segments and Vλ gene segments and the PCR primer sequences comprise a sequence which is at least 90% identical to the sequences of SEQ ID NOs: 1-47; or c. a first group of PCR primers comprising the first isolated PCR primer, and wherein all of the PCR primers in the first group copy VH gene segments, Vκ gene segments and Vλ gene segments and the PCR primer sequences comprise a sequence which is at least 90% identical to the sequences of SEQ ID NOs: 1-47.
14. The kit according to claim 1, wherein the kit comprises a. a mixture of at least 3 PCR primers which copy VH gene segments, at least 3 PCR primers which copy Vκ gene segments and at least 3 PCR primers which copy Vλ gene segments and wherein each PCR primer comprises at least 15 contiguous nucleotides of a sequence selected from SEQ ID NOs: 1-47; or b. wherein the mixture comprises at least 3 PCR primers which copy VH gene segments, at least 3 PCR primers which copy Vκ gene segments and at least 3 PCR primers which copy Vλ gene segments and wherein each PCR primer comprises a sequence which is at least 90% identical to a sequence selected from SEQ ID NOs: 1-47; or c. wherein the mixture comprises at least 3 PCR primers which copy VH gene segments, at least 3 PCR primers which copy Vκ gene segments and at least 3 PCR primers which copy Vλ gene segments and wherein each PCR primer comprises a sequence which is at least 90% identical to a sequence selected from SEQ ID NOs: 1-47.
15. The kit according to claim 1, wherein the kit comprises a. a mixture of two or more PCR primers, wherein all of the PCR primers copy VH gene segments, Vκ gene segments and Vλ gene segments and comprise at least 15 contiguous nucleotides of each of SEQ ID NOs: 1-47; or b. a first group of at least 48 different PCR primers comprising the first and second isolated PCR primers, wherein all of the PCR primers in the first group copy VH gene segments, Vκ gene segments and Vλ gene segments and wherein each of the 48 different PCR primers consists of a sequence which is at least 90% identical to a sequence selected from SEQ ID NOs: 1-47, such that a sequence at least 90% identical to each of the sequences of SEQ ID NOs: 1-47 is present in the first group; or c. a first group of at least 48 different PCR primers comprising the first isolated PCR primer, wherein all of the PCR primers in the first group copy VH gene segments, Vκ gene segments and Vλ gene segments and wherein each of the 48 different PCR primers consists of a sequence which is at least 90% identical to a sequence selected from SEQ ID NOs: 1-47, such that a sequence which is at least 90% identical to each of the sequences of SEQ ID NOs: 1-47 is present in the first group.
16. The kit according to claim 1, wherein the kit comprises a. a mixture of two or more PCR primers, wherein all of the primers copy VH gene segments, Vκ gene segments and Vλ gene segments and comprise at least 20 contiguous nucleotides of each of SEQ ID NOs: 1-47; or b. a first group of at least 48 different PCR primers comprising the first and second isolated PCR primers, wherein all of the PCR primers in the first group copy VH gene segments, Vκ gene segments and Vλ gene segments and wherein each of the 48 different PCR primers consists of a sequence which is at least 95% or 100% identical to a sequence selected from SEQ ID NOs: 1-47, such that a sequence at least 95% or 100% identical to each of the sequences of SEQ ID NOs: 1-47 is present in the first group; or c. a first group of at least 48 different PCR primers comprising the first isolated PCR primer, wherein all of the PCR primers in the first group copy VH gene segments, Vκ gene segments and Vλ gene segments and wherein each of the 48 different PCR primers consists of a sequence which is at least 95% or 100% identical to a sequence selected from SEQ ID NOs: 1-47, such that a sequence which is at least 95% or 100% identical to each of the sequences of SEQ ID NOs: 1-47 is present in the first group.
17. The kit according to claim 1b, wherein the first PCR primer is capable of hybridising to a human antibody variable region gene segment 5′UTR sequence of a gene comprised by a target nucleic acid; wherein the gene encodes a human V region and/or the second PCR primer is capable of hybridising to a second sequence; and wherein the second sequence is comprised by the target nucleic acid and is 3′ to the UTR sequence.
18. The kit according to claim 1, wherein the mixture further comprises at least one further isolated PCR primer, wherein the further PCR primer comprises at least one of: a. an antibody heavy chain constant region sequence; b. an antibody kappa chain constant region sequence; and c. an antibody lambda chain constant region sequence.
19. The kit according to claim 18, wherein the further PCR primer comprises at least one of: a. an antibody heavy chain constant region sequence at least 90% identical to SEQ ID NO: 48 or 49; b. an antibody kappa chain constant region sequence at least 90% identical to SEQ ID NO: 50 or 51; and c. an antibody lambda chain constant region sequence at least 90% identical to SEQ ID NO: 52 or 53.
20. The kit according to claim 18, wherein the further PCR primer comprises at least one of: a. an antibody heavy chain constant region sequence at least 95% or 100% identical to SEQ ID NO: 48 or 49; b. an antibody kappa chain constant region sequence at least 95% or 100% identical to SEQ ID NO: 50 or 51; and c. an antibody lambda chain constant region sequence at least 95% or 100% identical to SEQ ID NO: 52 or 53.
21. The kit according to claim 1c, wherein the first PCR primer comprises a promoter nucleotide sequence 5′ of the UTR sequence which is a human cytomegalovirus promoter fragment.